Submitted:
18 March 2025
Posted:
19 March 2025
You are already at the latest version
Abstract
Keywords:
1. Introduction
- We propose a novel LLM-Driven Conversational Search Session Synthesis (LLM-CSSS) method that leverages the generative capabilities of large language models to directly create synthetic conversational search data.
- We develop a training framework that enables the LLM to generate realistic and relevant multi-turn dialogues by interacting with a simulated search environment based on the Amazon Review dataset.
- We demonstrate through comprehensive experiments that conversational search models trained on the synthetic data generated by our LLM-CSSS method achieve significant performance improvements, highlighting the effectiveness of our approach in addressing the data scarcity challenge.
2. Related Work
2.1. Conversational Search
2.2. Large Language Models
3. Method
3.1. Model Architecture
3.2. Learning Strategy
4. Experiments
4.1. Experimental Setup
- Original Amazon Review Data (Adapted for Conversational Search): This baseline uses the Amazon Review dataset where user reviews are treated as user turns and corresponding product information serves as system turns, forming a sequence of interactions.
- ConvSDG: Conversational Session Data Generation via User-Item Interaction Sequences: This method, proposed in prior work, generates synthetic conversational sessions by transforming user-item interaction sequences into dialogues.
- Randomly Generated Conversations via Simple Language Model: This baseline generates conversational turns randomly using a basic n-gram language model trained on the Amazon Review dataset, without any specific structure or relevance constraints.
4.2. Main Results
4.3. Analysis of Generated Data
4.4. Human Evaluation
4.5. Impact of Simulated Search Environment
4.6. Analysis of Conversation Turn Length
4.7. Performance Across Different Product Categories
4.8. Impact of Reinforcement Learning Stage
5. Conclusion
References
- Mo, F.; Mao, K.; Zhao, Z.; Qian, H.; Chen, H.; Cheng, Y.; Li, X.; Zhu, Y.; Dou, Z.; Nie, J. A Survey of Conversational Search. CoRR, 2024, abs/2410.15576, [2410.15576]. [CrossRef]
- Soudani, H.; Petcu, R.; Kanoulas, E.; Hasibi, F. Data Augmentation for Conversational AI. In Proceedings of the Companion Proceedings of the ACM on Web Conference 2024, WWW 2024, Singapore, Singapore, May 13-17, 2024; Chua,T.;Ngo,C.;Lee,R.K.;Kumar,R.;Lauw,H.W.,Eds.ACM,2024,pp.1234–1237. [CrossRef]
- Zhou, Y.; Geng, X.; Shen, T.; Tao, C.; Long, G.; Lou, J.G.; Shen, J. Thread of thought unraveling chaotic contexts. arXiv preprint, arXiv:2311.08734 2023.
- Huang, C.; Hsu, C.; Hsu, T.; Li, C.; Chen, Y. CONVERSER: Few-shot Conversational Dense Retrieval with Synthetic Data Generation. In Proceedings of the Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2023, Prague, Czechia, September 11 - 15, 2023; Schlangen, D.; Stoyanchev,S.;Joty,S.;Dusek,O.;Kennington,C.;Alikhani,M.,Eds.AssociationforComputational Linguistics, 2023,pp.381–387. [CrossRef]
- Mo, F.; Yi, B.; Mao, K.; Qu, C.; Huang, K.; Nie, J. ConvSDG: Session Data Generation for Conversational Search. In Proceedings of the Companion Proceedings of the ACM on Web Conference 2024, WWW 2024, Singapore, Singapore, May 13-17, 2024; Chua,T.;Ngo,C.;Lee,R.K.;Kumar,R.;Lauw,H.W.,Eds.ACM,2024, pp. 1634–1642. [CrossRef]
- Zhang, X.; Yang, H.; Young, E.F.Y. Attentional Transfer is All You Need: Technology-aware Layout Pattern Generation. In Proceedings of the 58th ACM/IEEE Design Automation Conference, DAC 2021, San Francisco, CA, USA, December5-9 2021. IEEE, 2021; pp. 169–174. [CrossRef]
- Zhou, Y.; Li, X.; Wang, Q.; Shen, J. Visual In-Context Learning for Large Vision-Language Models. In Proceedings of the Findings of the Association for Computational Linguistics, ACL 2024, Bangkok, Thailand and virtual meeting, August11-16, 2024. Association for Computational Linguistics, 2024; pp. 15890–15902.
- Zhou, Y.; Zhang, J.; Chen, G.; Shen, J.; Cheng, Y. Less Is More: Vision Representation Compression for Efficient Video Generation with Large Language Models, 2024.
- Zhou, Y.; Song, L.; Shen, J. Training Medical Large Vision-Language Models with Abnormal-Aware Feedback. arXiv preprint arXiv:2501.01377, 2025.
- Zhou, Y.; Long, G. Style-Aware Contrastive Learning for Multi-Style Image Captioning. In Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023, pp. 2257–2267. [Google Scholar]
- McAuley, J.J.; Leskovec, J. From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews. In Proceedings of the Proceedings of the 22nd international conference on World Wide Web, 2013, pp.897–908.
- Repplinger, J. G.G. Chowdhury. Introduction to Modern Information Retrieval. 3rd ed. London: Facet, 2010. 508p. alk. paper, $90 (ISBN 9781555707156). LC2010-013746. Coll. Res. Libr. 2011, 72, 194–195. [Google Scholar]
- Zhang, Y.; Chen, X.; Ai, Q.; Yang, L.; Croft, W.B. Towards conversational search and recommendation: System ask, user respond. In Proceedings of the Proceedings of the 27th acm international conference on information and knowledge management, 2018, pp. 177–186.
- Qu, C.; Yang, L.; Croft, W.B.; Zhang, Y.; Trippas, J.R.; Qiu, M. User intent prediction in information-seeking conversations. In Proceedings of the Proceedings of the 2019 Conference on Human Information Interaction and Retrieval, 2019, pp.25–33.
- Zhou, Y.; Geng, X.; Shen, T.; Long, G.; Jiang, D. Eventbert: A pre-trained model for event correlation reasoning. In Proceedings of the Proceedings of the ACM Web Conference 2022, 2022, pp. 850–859. [Google Scholar]
- Al-Thani, H.; Elsayed, T.; Jansen, B.J. Improving conversational search with query reformulation using selective contextual history. Data and Information Management 2023, 7, 100025. [Google Scholar] [CrossRef]
- Zhou, Y.; Geng, X.; Shen, T.; Pei, J.; Zhang, W.; Jiang, D. Modeling event-pair relations in external knowledge graphs for script reasoning. Findings of the Association for Computational Linguistics: ACL-IJCNLP2021, 2021.
- Dixit, T.; Paranjape, B.; Hajishirzi, H.; Zettlemoyer, L. CORE: A retrieve-then-edit framework for counterfactual data generation. arXiv preprint arXiv:2210.04873, 2022.
- He, S.; Zhang, S.; Zhang, X.; Feng, Z. Improve conversational search with multi-document information. In Proceedings of the International Conference on Neural Information Processing. Springer; 2023; pp. 3–15. [Google Scholar]
- Wang, L.; Zhao, M.; Ji, H.; Jiang, Z.; Li, R.; Hu, Z.; Lu, X. Dialogue summarization enhanced response generation for multi-domain task-oriented dialogue systems. Inf. Process. Manag. 2024, 61, 103668. [Google Scholar] [CrossRef]
- Devlin, J.; Chang, M.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers); Burstein,J.; Doran,C.;Solorio,T.,Eds.AssociationforComputationalLinguistics,2019,pp.4171–4186. [CrossRef]
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I.; et al. Language models are unsupervised multitask learners. OpenAI blog 2019, 1, 9. [Google Scholar]
- Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. CoRR, 2019, abs/1907.11692, [1907.11692].
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res. 2020. 21, 140:1–140:67. [Google Scholar]
- Clark, K.; Luong, M.; Le, Q.V.; Manning, C.D. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. In Proceedings of the 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April26-30,2020. OpenReview.net, 2020.
- Thoppilan, R.; Freitas, D.D.; Hall, J.; Shazeer, N.; Kulshreshtha, A.; Cheng, H.; Jin, A.; Bos, T.; Baker, L.; Du, Y.; et al. LaMDA: Language Models for Dialog Applications. CoRR, 2022, abs/2201.08239, [2201.08239].
- Chowdhery, A.; Narang, S.; Devlin, J.; Bosma, M.; Mishra, G.; Roberts, A.; Barham, P.; Chung, H.W.; Sutton, C.; Gehrmann, S.; et al. PaLM: Scaling Language Modeling with Pathways. J. Mach. Learn. Res. 2023, 24, 240:1–240:113.
| Model Category - Data Source | MAP@10 | NDCG@10 |
|---|---|---|
| Retrieval-Based - Original Amazon Review Data | 0.25 | 0.38 |
| Retrieval-Based - ConvSDG | 0.28 | 0.42 |
| Retrieval-Based - Randomly Generated Conversations | 0.15 | 0.22 |
| Retrieval-Based - LLM-CSSS (Ours) | 0.32 | 0.47 |
| Model Category - Data Source | BLEU | METEOR |
| Generation-Based - Original Amazon Review Data | 0.22 | 0.35 |
| Generation-Based - ConvSDG | 0.25 | 0.39 |
| Generation-Based - Randomly Generated Conversations | 0.10 | 0.18 |
| Generation-Based - LLM-CSSS (Ours) | 0.29 | 0.43 |
| Method | Coherence | Relevance | Informativeness | Overall Quality |
|---|---|---|---|---|
| Randomly Generated Conversations | 2.1 | 1.8 | 1.5 | 1.7 |
| ConvSDG | 3.5 | 3.8 | 3.6 | 3.7 |
| LLM-CSSS (Ours) | 4.2 | 4.5 | 4.3 | 4.4 |
| Data Source | MAP@10 | NDCG@10 |
|---|---|---|
| LLM-CSSS (with Search) | 0.32 | 0.47 |
| LLM-CSSS (without Search) | 0.29 | 0.43 |
| Method | Average Turns |
|---|---|
| Original Amazon Review Data | 2.1 |
| ConvSDG | 3.5 |
| Randomly Generated Conversations | 4.8 |
| LLM-CSSS (Ours) | 4.1 |
| Data Source | Electronics | Books | Clothing, Shoes & Jewelry |
|---|---|---|---|
| Original Amazon Review Data | 0.22 | 0.28 | 0.24 |
| LLM-CSSS (Ours) | 0.30 | 0.35 | 0.31 |
| Data Source | MAP@10 | NDCG@10 |
|---|---|---|
| LLM-CSSS (with RL) | 0.32 | 0.47 |
| LLM-CSSS (without RL) | 0.30 | 0.45 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).