Submitted: 24 February 2025
Posted: 26 February 2025
Abstract
This paper addresses the challenge of ambiguous and poorly formulated user queries in Retrieval-Augmented Generation (RAG)-based conversational systems. Current RAG systems often struggle to provide satisfactory responses to such queries, hindering user experience. To mitigate this issue, we propose a novel approach for suggestion question generation that moves beyond traditional retrieval-based methods. Our method leverages the inherent knowledge and generative capabilities of Large Language Models (LLMs) to directly generate relevant and helpful suggestion questions, without explicit document retrieval during inference. We train our models on a dedicated dataset of user queries and curated suggestion questions using a supervised learning strategy. Extensive experiments, comparing our approach against zero-shot, few-shot, and RAG-based baselines, demonstrate the superior performance of our LLM-driven method in terms of correctness, relevance, and helpfulness, further validated by human evaluations. Ablation studies and error analysis provide deeper insights into the effectiveness and limitations of our approach. The results highlight the potential of purely generative models for user query refinement and suggest a paradigm shift in suggestion question generation for conversational AI.
Keywords:
1. Introduction
- We propose a novel and effective approach for suggestion question generation in conversational systems that eliminates reliance on external document retrieval, instead leveraging the inherent knowledge and generative capabilities of Large Language Models.
- We introduce a dedicated training methodology and dataset for fine-tuning LLMs to directly generate high-quality suggestion questions, providing a valuable resource for future research in this area (a minimal illustrative sketch of this fine-tuning setup follows this list).
- Through comprehensive experiments and evaluations, we demonstrate the superior performance of our purely LLM-driven approach compared to traditional RAG-based methods and baselines, highlighting the potential of this paradigm shift in conversational AI and user query refinement.
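To make the training setup concrete, the following is a minimal sketch, assuming the Hugging Face transformers library, a decoder-only model, and a JSONL dataset of {"query": ..., "suggestions": [...]} records; the checkpoint name, prompt template, and hyperparameters are illustrative assumptions, not the exact configuration used in this paper.

```python
# Minimal sketch of supervised fine-tuning for direct suggestion question
# generation. All names (checkpoint, file paths, template) are illustrative.
import json
import torch
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "gpt2"  # placeholder; the paper does not fix a specific checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def format_example(query: str, suggestions: list[str]) -> str:
    # One possible template: map a raw user query directly to a numbered
    # list of suggestion questions, with no retrieved documents in context.
    target = "\n".join(f"{i + 1}. {s}" for i, s in enumerate(suggestions))
    return f"User query: {query}\nSuggestion questions:\n{target}{tokenizer.eos_token}"

class SuggestionDataset(torch.utils.data.Dataset):
    def __init__(self, path: str):
        with open(path) as f:
            self.rows = [json.loads(line) for line in f]

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        enc = tokenizer(format_example(self.rows[idx]["query"],
                                       self.rows[idx]["suggestions"]),
                        truncation=True, max_length=256,
                        padding="max_length", return_tensors="pt")
        input_ids = enc["input_ids"].squeeze(0)
        attention_mask = enc["attention_mask"].squeeze(0)
        # Standard causal-LM objective; padding is masked out of the loss
        # (one could also mask the prompt tokens and train on targets only).
        labels = input_ids.clone()
        labels[attention_mask == 0] = -100
        return {"input_ids": input_ids,
                "attention_mask": attention_mask,
                "labels": labels}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="suggestion-lm",
                           num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=SuggestionDataset("train.jsonl"),
)
trainer.train()
```

At inference time, the fine-tuned model receives only the prompt prefix (`User query: ...\nSuggestion questions:`) and decodes the suggestions directly, with no retrieval step.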
2. Related Work
2.1. Generating Suggestion Questions
2.2. Large Language Models
3. Method
3.1. Model Architecture and Task Formulation
3.2. Learning Strategy Details
4. Experiments
4.1. Experimental Setup
4.1.1. Datasets
4.1.2. Baselines
- Zero-shot LLM: We directly prompted a pre-trained LLM (without fine-tuning) to generate suggestion questions given the user query. This baseline assesses the inherent zero-shot capability of pre-trained models for this task.
- Few-shot LLM with Example Prompting: We prompted a pre-trained LLM with a few hand-crafted examples of query-suggestion question pairs in the input prompt before generating suggestions for new queries. This baseline evaluates the effectiveness of few-shot in-context learning.
- RAG-based Suggestion Generation with Retrieved Documents: We implemented a traditional Retrieval-Augmented Generation (RAG) system adapted for suggestion question generation. This system retrieves relevant documents with a standard retrieval model (e.g., BM25) based on the user query, then uses a separate Transformer-based generation model to produce suggestion questions conditioned on the retrieved documents and the query. This baseline represents a strong traditional retrieval-based approach. An illustrative sketch of all three baselines appears after this list.
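For concreteness, the sketch below shows one way the three baselines could be assembled; the prompt wording, the rank_bm25 retriever, and the generate() placeholder are assumptions made for illustration, not the exact pipelines evaluated here.

```python
# Illustrative sketch of the three baseline configurations.
from rank_bm25 import BM25Okapi  # assumed BM25 implementation

def generate(prompt: str) -> str:
    """Placeholder for a call to a pre-trained LLM (no fine-tuning)."""
    raise NotImplementedError

# 1) Zero-shot: prompt the pre-trained model directly with the user query.
def zero_shot(query: str) -> str:
    return generate(f"User query: {query}\n"
                    "Generate two helpful suggestion questions:")

# 2) Few-shot: prepend hand-crafted query -> suggestion examples.
FEW_SHOT_EXAMPLES = (
    "User query: baby sleep\n"
    "Suggestion questions:\n"
    "1. What are some common baby sleep problems?\n"
    "2. How can I improve my baby's sleep?\n\n"
)
def few_shot(query: str) -> str:
    return generate(FEW_SHOT_EXAMPLES
                    + f"User query: {query}\nSuggestion questions:")

# 3) RAG: retrieve documents with BM25, then condition generation on them.
def rag(query: str, corpus: list[str], k: int = 3) -> str:
    bm25 = BM25Okapi([doc.split() for doc in corpus])
    top_docs = bm25.get_top_n(query.split(), corpus, n=k)
    context = "\n".join(top_docs)
    return generate(f"Documents:\n{context}\n\n"
                    f"User query: {query}\n"
                    "Generate two suggestion questions grounded in the documents:")
```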
4.1.3. Evaluation Metrics
4.2. Comparative Results
4.3. Ablation Study and Further Analysis
4.4. Human Evaluation
4.5. Error Analysis
4.6. Qualitative Examples
5. Conclusions
References
Table 1. Comparative results of our approach against the baselines.

| Method | Correctness (%) | Relevance (%) | Helpfulness (%) |
|---|---|---|---|
| Zero-shot LLM | 75.2 | 68.5 | 62.1 |
| Few-shot LLM with Example Prompting | 78.9 | 72.3 | 65.8 |
| RAG-based Suggestion Generation with Retrieved Documents | 82.5 | 75.9 | 69.5 |
| Our Approach (LLM-driven) | 88.7 | 81.2 | 74.9 |
Table 2. Ablation study: fine-tuned model versus zero-shot prompting of the same model.

| Method | Correctness (%) | Relevance (%) | Helpfulness (%) |
|---|---|---|---|
| Our Approach (LLM-driven - Fine-tuned) | 88.7 | 81.2 | 74.9 |
| Our Approach (LLM-driven - Zero-shot Prompt) | 80.1 | 73.5 | 67.2 |
Table 3. Human evaluation: rate at which annotators preferred our approach over each baseline.

| Preference for Our Approach vs. Baseline | Preference Rate |
|---|---|
| vs. Zero-shot LLM | 78.5% |
| vs. Few-shot LLM with Example Prompting | 72.1% |
| vs. RAG-based Suggestion Generation with Retrieved Documents | 65.3% |
Table 4. Error analysis: distribution of error types in suggestions generated by our approach.

| Error Type | Percentage of Errors |
|---|---|
| Incorrect Grammar/Fluency | 15.2% |
| Irrelevant to User Query | 28.7% |
| Not Helpful for Query Refinement | 35.1% |
| Redundant/Repetitive Suggestions | 12.5% |
| Too Generic/Lack Specificity | 8.5% |
Table 5. Qualitative examples of suggestion questions generated by each method.

| User Query | Our Approach (LLM-driven) | RAG-based Suggestion Generation with Retrieved Documents | Few-shot LLM with Example Prompting |
|---|---|---|---|
| baby sleep | 1. What are some common baby sleep problems?<br>2. How can I improve my baby’s sleep? | 1. What are the benefits of baby sleep?<br>2. What are the risks of poor baby sleep? | 1. How to get a baby to sleep through the night?<br>2. What is a good baby sleep schedule? |
| coffee shop near me | 1. Are you looking for a coffee shop with wifi?<br>2. Do you prefer coffee shops with outdoor seating? | 1. What are the opening hours of coffee shops near you?<br>2. What is the price range of coffee shops nearby? | 1. What are the best coffee shops in this city?<br>2. Show me directions to the nearest coffee shop. |
| translate to Spanish | 1. What text do you want to translate?<br>2. Do you want to translate a phrase or a sentence? | 1. What are the different dialects of Spanish?<br>2. What is the history of the Spanish language? | 1. Translate "hello world" to Spanish.<br>2. Translate this sentence into Spanish for me. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).