Preprint
Article

This version is not peer-reviewed.

Contextualized Diverse Reasoning: Enhancing Video Question Answering with Multi-Perspective MLLM Pathways

Submitted:

23 December 2025

Posted:

05 January 2026


Abstract
Video Question Answering (VideoQA) is challenging: it demands a comprehensive understanding of dynamic visual content, object interactions, and complex temporal and causal logic. Multimodal Large Language Models (MLLMs) offer powerful reasoning capabilities, but existing approaches often supply a single, potentially flawed reasoning path, which limits the robustness and depth of VideoQA models. To address this limitation, we propose Contextualized Diverse Reasoning (CDR), a novel framework designed to furnish VideoQA models with richer, multi-perspective auxiliary supervision. CDR comprises three key components: (1) a Diverse Reasoning Generator that gives MLLMs distinct viewpoint prompts to generate multiple, complementary reasoning pathways; (2) a Reasoning Pathway Refiner and Annotator that purifies these pathways by removing explicit answers and enriching them with semantic type annotations; and (3) a Context-Aware Reasoning Fusion module that dynamically integrates the refined, multi-dimensional reasoning cues with video and question features through an attention-based mechanism. Extensive experiments on several benchmark datasets show that CDR consistently achieves state-of-the-art performance, outperforming leading VideoQA models and MLLM-based methods. Ablation studies confirm the crucial role of each CDR component, and qualitative analysis and human evaluations further support the correctness of the answers and the coherence, completeness, and helpfulness of the generated reasoning pathways.
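The three components described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the viewpoint prompts, the answer-removal procedure, or the form of the attention mechanism. The prompt texts, the function names `refine` and `fuse`, the `[MASK]` token, and the use of scaled dot-product attention over toy vectors are all assumptions made for illustration.

```python
import math
import re

# Hypothetical viewpoint prompts; the abstract does not list the actual ones.
VIEWPOINTS = {
    "temporal": "Explain the order of events relevant to the question.",
    "causal": "Explain why the key event happens.",
    "object": "Describe the objects involved and how they interact.",
}


def refine(path, answer, view):
    """Mask the explicit answer in a reasoning path and attach a type label.

    Assumption: answer removal is modeled as case-insensitive string masking.
    """
    masked = re.sub(re.escape(answer), "[MASK]", path, flags=re.IGNORECASE)
    return {"type": view, "text": masked}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(query, path_vecs):
    """Attend from one video-question vector over the reasoning-path vectors.

    Assumption: scaled dot-product attention stands in for the paper's
    unspecified "attention-based mechanism".
    Returns the attention weights and the weighted sum of path vectors.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * p for q, p in zip(query, vec)) / scale for vec in path_vecs]
    weights = softmax(scores)
    fused = [
        sum(w * vec[i] for w, vec in zip(weights, path_vecs))
        for i in range(len(query))
    ]
    return weights, fused


if __name__ == "__main__":
    # Toy reasoning path standing in for one MLLM output.
    refined = refine("The dog barks, so the answer is Barks.", "barks", "causal")
    print(refined)

    # Toy 2-d features: the query aligns with the first path vector.
    weights, fused = fuse([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
    print(weights, fused)
```

In this sketch the reasoning path whose vector is most similar to the video-question vector receives the largest weight, which is one simple way to read "dynamically integrates". A real system would use learned encoders and projections in place of the toy vectors.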
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.


© 2026 MDPI (Basel, Switzerland) unless otherwise stated