Submitted: 27 November 2025
Posted: 28 November 2025
Abstract
Keywords:
1. Introduction
2. Materials and Methods
2.1. Study Samples and Data Sources
2.2. Experimental Design and Control Settings
2.3. Evaluation Procedure and Quality Control
2.4. Data Processing and Model Equations
2.5. Computing Environment and Reproducibility
3. Results and Discussion
3.1. Overall Performance Across the Three Benchmarks

3.2. Performance Differences Across Text, Code, and Table Queries
3.3. Influence of Context Budget and Reconstruction Module

3.4. Comparison with Other Multi-Source RAG Approaches
4. Conclusion
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).