Submitted: 03 April 2025
Posted: 04 April 2025
Abstract
Keywords:
1. Introduction
- A detailed review of the principles and motivations behind retrieval-augmented generation for large language models.
- An in-depth discussion of key techniques for retrieval and generation, including both traditional and cutting-edge methods.
- A thorough examination of the state-of-the-art applications of RAG in various domains, such as question answering, summarization, and dialogue generation.
- A critical analysis of the challenges and limitations associated with RAG, including retrieval efficiency, document quality, and coherence of generated responses.
- An exploration of future research directions and potential applications of RAG, with an emphasis on areas where further improvements can be made to enhance model performance [13].
2. Background and Related Work
2.1. Retrieval Techniques for Information Retrieval
2.1.1. Dense Retrieval Models
2.1.2. Sparse Retrieval Models
2.2. Generative Models for Text Generation
2.2.1. Autoregressive Models
2.2.2. Encoder-Decoder Models
2.2.3. Challenges with Generative Models
2.3. Retrieval-Augmented Generation (RAG) Models
2.3.1. Retrieval-Augmented Generation Framework
2.4. Related Work in Retrieval-Augmented Generation
3. Retrieval-Augmented Generation: Techniques and Architectures
3.1. Retrieval Mechanisms in RAG Models
3.1.1. Sparse Retrieval Methods
3.1.2. Dense Retrieval Methods
3.2. Integrating Retrieval with Generation
3.2.1. Simple Concatenation
3.2.2. Fusion-in-Decoder (FiD)
3.2.3. Retrieval-Augmented Generation with Attention Mechanisms
3.3. Advancements in Retrieval-Augmented Generation
4. Applications of Retrieval-Augmented Generation
4.1. Open-Domain Question Answering
4.1.1. RAG for Open-Domain QA
- Retrieval: Given a question q, a retrieval mechanism (usually a dense retrieval model like DPR) retrieves a set of relevant documents from the corpus [51].
- Generation: The retrieved documents are then passed to a generative model, which generates an answer a based on the question and the retrieved documents. A minimal code sketch of this two-stage pipeline follows.
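To make the pipeline concrete, the sketch below pairs the public DPR question and passage encoders from Hugging Face Transformers with exact inner-product scoring over a toy in-memory corpus. The checkpoints, corpus, and prompt format are illustrative assumptions rather than a prescribed implementation; a production system would embed a full corpus offline and use approximate nearest neighbor search.

```python
# Minimal retrieve-then-generate sketch (illustrative; toy two-document corpus).
import torch
from transformers import (DPRContextEncoder, DPRContextEncoderTokenizer,
                          DPRQuestionEncoder, DPRQuestionEncoderTokenizer)

corpus = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Mount Everest is the highest mountain above sea level.",
]

# Assumed public DPR checkpoints; any dual-encoder retriever works the same way.
ctx_tok = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
ctx_enc = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
q_tok = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
q_enc = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")

question = "Where is the Eiffel Tower?"
with torch.no_grad():
    # Passages are embedded once (offline in practice); only the question is
    # embedded at query time.
    p_emb = ctx_enc(**ctx_tok(corpus, padding=True, return_tensors="pt")).pooler_output
    q_emb = q_enc(**q_tok(question, return_tensors="pt")).pooler_output

# Retrieval: rank passages by inner product, as in DPR, and keep the top-k.
scores = (q_emb @ p_emb.T).squeeze(0)
top_docs = [corpus[i] for i in scores.topk(k=1).indices.tolist()]

# Generation: condition any generative LM on the question plus retrieved evidence.
prompt = f"Context: {' '.join(top_docs)}\nQuestion: {question}\nAnswer:"
print(prompt)  # feed this prompt to a seq2seq or decoder-only model to produce a
```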
4.1.2. Benchmarks and Performance
4.2. Text Summarization
4.2.1. RAG for Abstractive Summarization
4.2.2. RAG for Extractive Summarization
4.2.3. Evaluating Summarization Systems
4.3. Dialogue Systems
4.3.1. RAG for Dialogue Generation
4.3.2. Challenges in Dialogue Systems
4.4. Specialized Domains: Medical and Legal Applications
4.4.1. RAG in Medical Applications
4.4.2. RAG in Legal Applications
4.5. Conclusion
5. Challenges and Open Issues in Retrieval-Augmented Generation
5.1. Challenges in Retrieval Quality
5.1.1. Irrelevant or Noisy Documents
5.1.2. Out-of-Distribution and Unknown Information
5.1.3. Evaluation of Retrieval Quality
5.2. Challenges in Information Integration
5.2.1. Handling Ambiguity and Conflicting Information
5.2.2. Complexity of Combining Multiple Sources
5.3. Scalability and Efficiency
5.3.1. Efficient Retrieval and Generation
5.3.2. Real-Time Retrieval in Open-Domain Systems
5.4. Ethical and Fairness Considerations
5.4.1. Bias in Retrieval
5.4.2. Misinformation and Hallucination
5.5. Ongoing Research Directions
- Improved Retrieval Techniques: Continued advancements in retrieval models, such as more efficient dense retrieval methods or hybrid retrieval models, can help improve the quality and relevance of the retrieved documents [91].
- Enhanced Information Integration: Research into better methods for handling conflicting information and fusing multiple retrieved sources could lead to more coherent and accurate outputs.
- Scalable Architectures: More scalable architectures for RAG systems, including distributed retrieval and generation, will be crucial to deploy these models in real-time applications [92].
- Fairness and Bias Mitigation: Developing techniques for bias detection and correction, as well as methods for ensuring fairness in retrieval and generation, is essential for building ethical RAG systems [93].
6. Conclusion and Future Directions
6.1. Summary of Key Points
- The Concept of Retrieval-Augmented Generation: RAG models combine a retrieval mechanism with a generative model to enhance the quality of responses by grounding them in external knowledge sources.
- Applications of RAG: We discussed the various domains in which RAG models have been successfully applied, such as open-domain question answering, text summarization, and dialogue systems. These models have shown significant improvements over traditional methods by utilizing dynamic, real-time information from vast corpora.
- Challenges in RAG Models: We identified several challenges, including the quality of retrieved documents, the integration of external knowledge into the generation process, issues with scalability, and ethical concerns such as bias and misinformation.
- Evaluation and Benchmarks: RAG models have been evaluated on standard NLP tasks, demonstrating improvements in accuracy and relevance. However, new evaluation metrics are needed to better assess the quality of retrieval, the integration of knowledge, and the ethical implications of the generated output.
6.2. Future Directions
6.2.1. Improved Retrieval Mechanisms
- Hybrid Retrieval Models: Exploring hybrid models that combine sparse (e.g., TF-IDF, BM25) and dense (e.g., DPR, Sentence-BERT) retrieval techniques could further enhance the accuracy and efficiency of the retrieval process (a fusion sketch follows this list).
- End-to-End Retrieval-Augmented Systems: There is an opportunity to develop end-to-end architectures that seamlessly combine retrieval and generation processes, reducing the reliance on separate components and improving system integration.
- Context-Aware Retrieval: Current retrieval methods do not always consider the full conversational or document context. Future retrieval models should account for broader context to improve the relevance of retrieved documents, especially in multi-turn dialogue or long document summarization tasks.
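As one concrete way to combine sparse and dense signals, the sketch below fuses a BM25 ranking with a dense ranking using reciprocal rank fusion (RRF). The rank_bm25 dependency, toy corpus, and fixed dense ranking are assumptions made for illustration; in practice the dense list would come from an encoder such as DPR, and other fusion schemes (e.g., weighted score interpolation) are equally valid.

```python
# Hypothetical hybrid retrieval: fuse sparse (BM25) and dense rankings with
# reciprocal rank fusion (RRF). Corpus, query, and dense ranking are illustrative.
from rank_bm25 import BM25Okapi  # assumed dependency: pip install rank-bm25

def rrf_fuse(rankings, k=60):
    """Merge rankings (lists of doc ids, best first) by summed reciprocal ranks."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(fused, key=fused.get, reverse=True)

corpus = ["sparse retrieval relies on exact term matching",
          "dense retrieval embeds queries and passages into one vector space",
          "hybrid systems combine lexical and semantic relevance signals"]
query = "combining sparse and dense retrieval"

# Sparse ranking: BM25 scores over whitespace-tokenized text.
bm25 = BM25Okapi([doc.split() for doc in corpus])
bm25_scores = bm25.get_scores(query.split())
sparse_rank = sorted(range(len(corpus)), key=lambda i: -bm25_scores[i])

# Dense ranking: placeholder stand-in for an embedding-similarity ordering.
dense_rank = [1, 2, 0]

print(rrf_fuse([sparse_rank, dense_rank]))  # fused ordering of document ids
```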
6.2.2. Enhanced Information Fusion and Reasoning
- Multi-Document Reasoning: Building models that can handle and reason over multiple documents simultaneously, allowing them to synthesize diverse pieces of information into a coherent and contextually grounded output.
- Knowledge Graphs and Structured Data: Incorporating structured knowledge from external sources, such as knowledge graphs or databases, could help RAG models reason more effectively and make better decisions based on factual relationships between entities (see the sketch after this list).
- Explainable Generation: Research into explainable AI for generative models could allow RAG systems to provide explanations for their retrieved knowledge and reasoning, increasing trust and transparency in applications such as healthcare or law.
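To illustrate how structured knowledge could be surfaced to the generator, here is a small sketch that verbalizes knowledge-graph triples into textual context before prompting. The triples, verbalization scheme, and prompt layout are invented for the example; real systems would retrieve triples from a KG store and may use more sophisticated linearizations.

```python
# Illustrative sketch: verbalize knowledge-graph triples so a RAG generator can
# condition on explicit entity relationships. All data here is made up.
triples = [
    ("Aspirin", "treats", "headache"),
    ("Aspirin", "interacts_with", "warfarin"),
]

def triples_to_context(triples):
    # Render each (subject, relation, object) triple as a short sentence.
    return " ".join(f"{s} {r.replace('_', ' ')} {o}." for s, r, o in triples)

question = "What interactions should an aspirin user be aware of?"
prompt = f"Facts: {triples_to_context(triples)}\nQuestion: {question}\nAnswer:"
print(prompt)  # the generator can now cite the factual relations it was given
```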
6.2.3. Scalability and Efficiency Improvements
- Optimized Retrieval Pipelines: New techniques to speed up the retrieval process without compromising accuracy are necessary, especially in real-time applications. Methods such as quantization, pruning, and approximate nearest neighbor (ANN) search can be further optimized for RAG systems (a FAISS-based sketch follows this list).
- Model Distillation and Compression: To reduce computational cost, distilling large RAG models into smaller, more efficient variants while retaining performance will be crucial for deploying these models in resource-constrained environments [94].
- Efficient Inference: Developing new inference techniques to streamline both the retrieval and generation phases of RAG systems, allowing them to scale up to larger corpora and deliver responses in real-time, is a key challenge for the next generation of systems.
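As one example of the optimized-pipeline direction, the sketch below builds an inverted-file (IVF) index with FAISS so that queries probe only a few clusters instead of scanning every vector. The dimensionality, index parameters, and random stand-in embeddings are illustrative assumptions, not tuned settings.

```python
# Sketch of approximate nearest neighbor (ANN) retrieval with a FAISS IVF index.
# Embeddings are random stand-ins; parameters are illustrative, not tuned.
import numpy as np
import faiss  # assumed dependency: pip install faiss-cpu

dim, n_passages = 128, 10_000
passages = np.random.rand(n_passages, dim).astype("float32")

# IVF: cluster vectors into nlist cells; at query time probe only nprobe cells,
# trading a little recall for a large speedup over exact search.
nlist = 100
quantizer = faiss.IndexFlatIP(dim)
index = faiss.IndexIVFFlat(quantizer, dim, nlist, faiss.METRIC_INNER_PRODUCT)
index.train(passages)
index.add(passages)
index.nprobe = 8

query = np.random.rand(1, dim).astype("float32")
scores, ids = index.search(query, 5)  # top-5 approximate neighbors
print(ids[0], scores[0])
```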
6.2.4. Ethical Considerations and Fairness
- Bias Mitigation in Retrieval and Generation: Investigating methods to detect and mitigate biases both in the retrieval stage (e.g., biased corpora) and in the generative process (e.g., biased language models) will be key to ensuring fairness and equity in RAG systems.
- Misinformation Detection and Prevention: Developing strategies to detect and prevent the generation of false or misleading content is essential, particularly in high-stakes domains such as medical, legal, or financial advice.
- Transparency and Accountability: Research into mechanisms for improving transparency in RAG models, such as providing insights into the decision-making process and the sources of retrieved information, will be critical in building user trust and accountability in AI systems.
6.2.5. Personalized and Multi-Domain Systems
6.3. Final Thoughts
References
- Madaan, A.; Tandon, N.; Gupta, P.; Hallinan, S.; Gao, L.; Wiegreffe, S.; Alon, U.; Dziri, N.; Prabhumoye, S.; Yang, Y.; et al. Self-Refine: Iterative Refinement with Self-Feedback. arXiv 2023, arXiv:2303.17651.
- Thakur, N.; Reimers, N.; Rücklé, A.; Srivastava, A.; Gurevych, I. BEIR: A Heterogeneous Benchmark for Zero-shot Evaluation of Information Retrieval Models. arXiv 2021, arXiv:2104.08663.
- Narayan, S.; Cohen, S.B.; Lapata, M. Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization. arXiv 2018, arXiv:1808.08745.
- Wang, H.; Hu, M.; Deng, Y.; Wang, R.; Mi, F.; Wang, W.; Wang, Y.; Kwan, W.C.; King, I.; Wong, K.F. Large Language Models as Source Planner for Personalized Knowledge-grounded Dialogue. arXiv 2023, arXiv:2310.08840.
- Fan, A.; Jernite, Y.; Perez, E.; Grangier, D.; Weston, J.; Auli, M. ELI5: Long Form Question Answering. arXiv 2019, arXiv:1907.09190.
- Nebel, B. On the compilability and expressive power of propositional planning formalisms. Journal of Artificial Intelligence Research 2000, 12, 271–315.
- Trivedi, H.; Balasubramanian, N.; Khot, T.; Sabharwal, A. Interleaving retrieval with chain-of-thought reasoning for knowledge-intensive multi-step questions. arXiv 2022, arXiv:2212.10509.
- Baek, J.; Aji, A.F.; Saffari, A. Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering. arXiv 2023, arXiv:2306.04136.
- Welbl, J.; Stenetorp, P.; et al. 2WikiMultiHopQA: Multihop Question Answering over Wikipedia Articles. EMNLP 2018.
- Kwiatkowski, T.; Palomaki, J.; Redfield, O.; Collins, M.; Parikh, A.; Alberti, C.; Epstein, D.; Polosukhin, I.; Devlin, J.; Lee, K.; et al. Natural Questions: a benchmark for question answering research. Transactions of the Association for Computational Linguistics 2019, 7, 453–466.
- Saad-Falcon, J.; Khattab, O.; Potts, C.; Zaharia, M. ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems. arXiv 2023, arXiv:2311.09476.
- Saha, A.; Pahuja, V.; Khapra, M.M.; Sankaranarayanan, K.; Chandar, S. Complex Sequential Question Answering: Towards Learning to Converse Over Linked Question Answer Pairs with a Knowledge Graph. arXiv 2018, arXiv:1801.10314.
- LangChain. LangSmith: The Ultimate Toolkit for Debugging and Monitoring LLM Applications. https://www.langchain.com/langsmith, 2025. Accessed: 2025-01-28.
- Geva, M.; Khashabi, D.; Segal, E.; Khot, T.; Roth, D.; Berant, J. Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies. arXiv 2021, arXiv:2101.02235.
- Qin, Y.; Cai, Z.; Jin, D.; Yan, L.; Liang, S.; Zhu, K.; Lin, Y.; Han, X.; Ding, N.; Wang, H.; et al. WebCPM: Interactive Web Search for Chinese Long-form Question Answering. arXiv 2023, arXiv:2305.06849.
- Li, S.; Ji, H.; Han, J. Document-Level Event Argument Extraction by Conditional Generation. arXiv 2021, arXiv:2104.05919.
- Pan, F.; Canim, M.; Glass, M.; Gliozzo, A.; Hendler, J. End-to-End Table Question Answering via Retrieval-Augmented Generation. arXiv 2022, arXiv:2203.16714.
- Li, X.; Zhao, R.; Chia, Y.K.; Ding, B.; Bing, L.; Joty, S.; Poria, S. Chain of Knowledge: A Framework for Grounding Large Language Models with Structured Knowledge Bases. arXiv 2023, arXiv:2305.13269.
- Lan, T.; Cai, D.; Wang, Y.; Huang, H.; Mao, X.L. Copy is All You Need. In Proceedings of The Eleventh International Conference on Learning Representations, 2023.
- Khattab, O.; Zaharia, M. ColBERT: Efficient and effective passage search via contextualized late interaction over BERT. In Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020; pp. 39–48.
- Ranzato, M.; Chopra, S.; Auli, M.; Zaremba, W. Sequence level training with recurrent neural networks. arXiv 2015, arXiv:1511.06732.
- Lyu, Y.; Li, Z.; Niu, S.; Xiong, F.; Tang, B.; Wang, W.; Wu, H.; Liu, H.; Xu, T.; Chen, E. CRUD-RAG: A comprehensive Chinese benchmark for retrieval-augmented generation of large language models. arXiv 2024, arXiv:2401.17043.
- Chen, D.; Yih, W.t. Open-domain question answering. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts, 2020; pp. 34–37.
- Cheng, X.; Gao, S.; Liu, L.; Zhao, D.; Yan, R. Neural machine translation with contrastive translation memories. arXiv 2022, arXiv:2212.03140.
- Zheng, H.S.; Mishra, S.; Chen, X.; Cheng, H.T.; Chi, E.H.; Le, Q.V.; Zhou, D. Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models. arXiv 2023, arXiv:2310.06117.
- VoyageAI. Voyage’s embedding models. https://docs.voyageai.com/embeddings/, 2023.
- Xia, M.; Huang, G.; Liu, L.; Shi, S. Graph based translation memory for neural machine translation. In Proceedings of the AAAI Conference on Artificial Intelligence, 2019; Vol. 33, pp. 7297–7304.
- Zhao, W.X.; Liu, J.; Ren, R.; Wen, J.R. Dense text retrieval based on pretrained language models: A survey. ACM Transactions on Information Systems 2024, 42, 1–60.
- Zhang, Y.; Li, Y.; Cui, L.; Cai, D.; Liu, L.; Fu, T.; Huang, X.; Zhao, E.; Zhang, Y.; Chen, Y.; et al. Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models. arXiv 2023, arXiv:2309.01219.
- Elsahar, H.; Vougiouklis, P.; Remaci, A.; Gravier, C.; Hare, J.; Laforest, F.; Simperl, E. T-REx: A large scale alignment of natural language with knowledge base triples. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2018.
- Yoran, O.; Wolfson, T.; Ram, O.; Berant, J. Making retrieval-augmented language models robust to irrelevant context. arXiv 2023, arXiv:2310.01558.
- DeepLearning.AI. How Agents Can Improve LLM Performance. https://www.deeplearning.ai/the-batch/how-agents-can-improve-llm-performance/?ref=dl-staging-website.ghost.io, 2024. Accessed: 2025-01-13.
- Yang, Z.; Qi, P.; Zhang, S.; Bengio, Y.; Cohen, W.W.; Salakhutdinov, R.; Manning, C.D. HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering. arXiv 2018, arXiv:1809.09600.
- Melz, E. Enhancing LLM intelligence with ARM-RAG: Auxiliary Rationale Memory for Retrieval Augmented Generation. arXiv 2023, arXiv:2311.04177.
- Zhang, P.; Xiao, S.; Liu, Z.; Dou, Z.; Nie, J.Y. Retrieve Anything To Augment Large Language Models. arXiv 2023, arXiv:2310.07554.
- Shi, T.; Li, L.; Lin, Z.; Yang, T.; Quan, X.; Wang, Q. Dual-Feedback Knowledge Retrieval for Task-Oriented Dialogue Systems. arXiv 2023, arXiv:2310.14528.
- Yan, S.Q.; Gu, J.C.; Zhu, Y.; Ling, Z.H. Corrective Retrieval Augmented Generation. arXiv 2024, arXiv:2401.15884.
- Kim, G.; Kim, S.; Jeon, B.; Park, J.; Kang, J. Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Models. arXiv 2023, arXiv:2310.14696.
- Xu, P.; Ping, W.; Wu, X.; McAfee, L.; Zhu, C.; Liu, Z.; Subramanian, S.; Bakhturina, E.; Shoeybi, M.; Catanzaro, B. Retrieval meets long context large language models. arXiv 2023, arXiv:2310.03025.
- Kotonya, N.; Toni, F. Explainable Automated Fact-Checking for Public Health Claims. arXiv 2020, arXiv:2010.09926.
- LlamaCloud Demo Repository. Research Paper Report Generation Workflow using LlamaCloud. https://github.com/run-llama/llamacloud-demo/blob/main/examples/report_generation/research_paper_report_generation.ipynb, 2025. Accessed: 2025-01-13.
- Wang, Z.; Araki, J.; Jiang, Z.; Parvez, M.R.; Neubig, G. Learning to filter context for retrieval-augmented generation. arXiv 2023, arXiv:2311.08377.
- Robertson, S.; Zaragoza, H.; et al. The probabilistic relevance framework: BM25 and beyond. Foundations and Trends® in Information Retrieval 2009, 3, 333–389.
- Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z.; et al. A Survey of Large Language Models. arXiv 2024, arXiv:2303.18223.
- Shao, Z.; Gong, Y.; Shen, Y.; Huang, M.; Duan, N.; Chen, W. Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy. arXiv 2023, arXiv:2305.15294.
- Ovadia, O.; Brief, M.; Mishaeli, M.; Elisha, O. Fine-tuning or retrieval? Comparing knowledge injection in LLMs. arXiv 2023, arXiv:2312.05934.
- Nguyen, T.; Rosenberg, M.; Song, X.; Gao, J.; Tiwary, S.; Majumder, R.; Deng, L. MS MARCO: A human-generated machine reading comprehension dataset. arXiv 2016, arXiv:1611.09268.
- Purwar, A.; Sundar, R. Keyword Augmented Retrieval: Novel framework for Information Retrieval integrated with speech interface. arXiv 2023, arXiv:2310.04205.
- Sun, Z.; Wang, X.; Tay, Y.; Yang, Y.; Zhou, D. Recitation-augmented language models. arXiv 2022, arXiv:2210.01296.
- Xiao, G.; Tian, Y.; Chen, B.; Han, S.; Lewis, M. Efficient streaming language models with attention sinks. arXiv 2023, arXiv:2309.17453.
- Gou, Z.; Shao, Z.; Gong, Y.; Shen, Y.; Yang, Y.; Duan, N.; Chen, W. CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing. arXiv 2023, arXiv:2305.11738.
- Cheng, X.; Luo, D.; Chen, X.; Liu, L.; Zhao, D.; Yan, R. Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory. arXiv 2023, arXiv:2305.02437.
- Raudaschl, A.H. Forget RAG, the Future is RAG-Fusion. https://towardsdatascience.com/forget-rag-the-future-is-rag-fusion-1147298d8ad1, 2023.
- Cerny, T.; Abdelfattah, A.S.; Bushong, V.; Maruf, A.A.; Taibi, D. Microservice Architecture Reconstruction and Visualization Techniques: A Review. arXiv 2022, arXiv:2207.02988.
- He, X.; Tian, Y.; Sun, Y.; Chawla, N.V.; Laurent, T.; LeCun, Y.; Bresson, X.; Hooi, B. G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering. arXiv 2024, arXiv:2402.07630.
- Kapoor, S.; Stroebl, B.; Siegel, Z.S.; Nadgir, N.; Narayanan, A. AI Agents That Matter. arXiv 2024, arXiv:2407.01502.
- Pang, R.Y.; Parrish, A.; Joshi, N.; Nangia, N.; Phang, J.; Chen, A.; Padmakumar, V.; Ma, J.; Thompson, J.; He, H.; et al. QuALITY: Question answering with long input texts, yes! arXiv 2021, arXiv:2112.08608.
- Luo, Z.; Xu, C.; Zhao, P.; Geng, X.; Tao, C.; Ma, J.; Lin, Q.; Jiang, D. Augmented Large Language Models with Parametric Knowledge Guiding. arXiv 2023, arXiv:2305.04757.
- Mavromatis, C.; Karypis, G. GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning. arXiv 2024, arXiv:2405.20139.
- Singh, A. A Survey of AI Text-to-Image and AI Text-to-Video Generators. In Proceedings of the 2023 4th International Conference on Artificial Intelligence, Robotics and Control (AIRC), 2023; pp. 32–36.
- Cobbe, K.; Kosaraju, V.; Bavarian, M.; Chen, M.; Jun, H.; Kaiser, L.; Plappert, M.; Tworek, J.; Hilton, J.; Nakano, R.; et al. Training Verifiers to Solve Math Word Problems. arXiv 2021, arXiv:2110.14168.
- Socher, R.; Perelygin, A.; Wu, J.; Chuang, J.; Manning, C.D.; Ng, A.Y.; Potts, C. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013; pp. 1631–1642.
- Yang, H.; Yue, S.; He, Y. Auto-GPT for Online Decision Making: Benchmarks and Additional Opinions. arXiv 2023, arXiv:2306.02224.
- Hayashi, H.; Budania, P.; Wang, P.; Ackerson, C.; Neervannan, R.; Neubig, G. WikiAsp: A dataset for multi-domain aspect-based summarization. Transactions of the Association for Computational Linguistics 2021, 9, 211–225.
- Wang, S.; Xu, Y.; Fang, Y.; Liu, Y.; Sun, S.; Xu, R.; Zhu, C.; Zeng, M. Training data is more valuable than you think: A simple and effective method by retrieving from training data. arXiv 2022, arXiv:2203.08773.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Advances in Neural Information Processing Systems 2017, 30.
- Levesque, H.J. Foundations of a functional approach to knowledge representation. Artificial Intelligence 1984, 23, 155–212.
- Ouyang, L.; Wu, J.; Jiang, X.; Almeida, D.; Wainwright, C.; Mishkin, P.; Zhang, C.; Agarwal, S.; Slama, K.; Ray, A.; et al. Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems 2022, 35, 27730–27744.
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Advances in Neural Information Processing Systems 2020, 33, 1877–1901.
- Wen, T.H.; Gasic, M.; Mrksic, N.; Rojas-Barahona, L.M.; Su, P.H.; Ultes, S.; Vandyke, D.; Young, S. Conditional generation and snapshot learning in neural dialogue systems. arXiv 2016, arXiv:1606.03352.
- Wang, X.; Yang, Q.; Qiu, Y.; Liang, J.; He, Q.; Gu, Z.; Xiao, Y.; Wang, W. KnowledGPT: Enhancing Large Language Models with Retrieval and Storage Access on Knowledge Bases. arXiv 2023, arXiv:2308.11761.
- Chan, D.M.; Ghosh, S.; Rastrow, A.; Hoffmeister, B. Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition. arXiv 2023, arXiv:2301.02736.
- Dam, S.K.; Hong, C.S.; Qiao, Y.; Zhang, C. A Complete Survey on LLM-based AI Chatbots. arXiv 2024, arXiv:2406.16937.
- Gao, Y.; Xiong, Y.; Gao, X.; Jia, K.; Pan, J.; Bi, Y.; Dai, Y.; Sun, J.; Wang, M.; Wang, H. Retrieval-Augmented Generation for Large Language Models: A Survey. arXiv 2024, arXiv:2312.10997.
- Kočiský, T.; Schwarz, J.; Blunsom, P.; Dyer, C.; Hermann, K.M.; Melis, G.; Grefenstette, E. The NarrativeQA Reading Comprehension Challenge. arXiv 2017, arXiv:1712.07040.
- Wu, Q.; Bansal, G.; Zhang, J.; Wu, Y.; Li, B.; Zhu, E.; Jiang, L.; Zhang, X.; Zhang, S.; Liu, J.; et al. AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework. arXiv 2023, arXiv:2308.08155.
- Husain, H.; Wu, H.H.; Gazit, T.; Allamanis, M.; Brockschmidt, M. CodeSearchNet challenge: Evaluating the state of semantic code search. arXiv 2019, arXiv:1909.09436.
- Ren, Y.; Cao, Y.; Guo, P.; Fang, F.; Ma, W.; Lin, Z. Retrieve-and-sample: Document-level event argument extraction via hybrid retrieval augmentation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023; pp. 293–306.
- Li, J.; Li, D.; Savarese, S.; Hoi, S. BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. arXiv 2023, arXiv:2301.12597.
- Ravuru, C.; Sakhinana, S.S.; Runkana, V. Agentic Retrieval-Augmented Generation for Time Series Analysis. arXiv 2024, arXiv:2408.14484.
- Zniyed, Y.; Nguyen, T.P.; et al. Efficient tensor decomposition-based filter pruning. Neural Networks 2024, 178, 106393.
- Trivedi, H.; Balasubramanian, N.; Khot, T.; Sabharwal, A. MuSiQue: Multihop Questions via Single-hop Question Composition. Transactions of the Association for Computational Linguistics 2022, 10, 539–554.
- Li, X.; Nie, E.; Liang, S. From Classification to Generation: Insights into Crosslingual Retrieval Augmented ICL. arXiv 2023, arXiv:2311.06595.
- Wang, L.; Yang, N.; Wei, F. Query2doc: Query Expansion with Large Language Models. arXiv 2023, arXiv:2303.07678.
- He, R.; McAuley, J. Ups and Downs: Modeling the Visual Evolution of Fashion Trends with One-Class Collaborative Filtering. In Proceedings of the 25th International Conference on World Wide Web, Republic and Canton of Geneva, CHE, 2016.
- Sarthi, P.; Abdullah, S.; Tuli, A.; Khanna, S.; Goldie, A.; Manning, C.D. RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval. arXiv 2024, arXiv:2401.18059.
- Wang, X.; Chen, G.H.; Song, D.; Zhang, Z.; Chen, Z.; Xiao, Q.; Jiang, F.; Li, J.; Wan, X.; Wang, B.; et al. CMB: A Comprehensive Medical Benchmark in Chinese. arXiv 2024, arXiv:2308.08833.
- Bisk, Y.; Zellers, R.; Gao, J.; Choi, Y.; et al. PIQA: Reasoning about physical commonsense in natural language. In Proceedings of the AAAI Conference on Artificial Intelligence, 2020; Vol. 34, pp. 7432–7439.
- Li, X.; Li, J. AnglE-optimized Text Embeddings. arXiv 2023, arXiv:2309.12871.
- NVIDIA. Spectrum-X: End-to-End Networking for AI and High-Performance Computing. https://www.nvidia.com/en-us/networking/spectrumx/, 2025. Accessed: 2025-01-28.
- Lebret, R.; Grangier, D.; Auli, M. Neural text generation from structured data with application to the biography domain. arXiv 2016, arXiv:1603.07771.
- Baek, J.; Jeong, S.; Kang, M.; Park, J.C.; Hwang, S.J. Knowledge-Augmented Language Model Verification. arXiv 2023, arXiv:2310.12836.
- Zniyed, Y.; Nguyen, T.P.; et al. Enhanced network compression through tensor decompositions and pruning. IEEE Transactions on Neural Networks and Learning Systems 2024.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
