Preprint Article, Version 2 (preserved in Portico). This version is not peer-reviewed.

When to Use Large Language Model: Upper Bound Analysis of BM25 Algorithms in Reading Comprehension Task

Version 1 : Received: 10 January 2023 / Approved: 12 January 2023 / Online: 12 January 2023 (08:58:03 CET)
Version 2 : Received: 29 March 2023 / Approved: 30 March 2023 / Online: 30 March 2023 (03:51:37 CEST)

A peer-reviewed article of this Preprint also exists.

T. Liu, Q. Xiong and S. Zhang, "When to Use Large Language Model: Upper Bound Analysis of BM25 Algorithms in Reading Comprehension Task," 2023 5th International Conference on Natural Language Processing (ICNLP), Guangzhou, China, 2023, pp. 1-4, doi: 10.1109/ICNLP58431.2023.00049.

Abstract

Large language models (LLMs) represent a major advancement in AI and have been applied to many natural language processing tasks. Nevertheless, in different business scenarios an LLM must be fine-tuned by engineers to reach satisfactory performance, and the cost of fine-tuning may not be justified by the performance gained. Using the Baidu STI dataset, we study the upper bound of the performance that classical information retrieval methods can achieve on a specific business task, and we compare it with the cost and performance of the LLM-based approaches of the participating teams. This paper gives insight into the potential of classical computational linguistics algorithms and can help decision-makers choose sensibly between LLMs and low-cost methods in business R&D.
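For context, BM25 ranks candidate passages with a bag-of-words relevance score rather than learned representations. The sketch below shows the standard Okapi BM25 scoring function in Python; the function name, the smoothed IDF variant, and the defaults k1 = 1.5 and b = 0.75 are conventional illustrative choices, not the configuration used in the paper.

    import math
    from collections import Counter

    def bm25_score(query_terms, doc_terms, doc_freq, num_docs, avgdl,
                   k1=1.5, b=0.75):
        """Score one document (passage) against a query with Okapi BM25."""
        tf = Counter(doc_terms)          # term frequencies in this document
        dl = len(doc_terms)              # document length in tokens
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            n = doc_freq.get(term, 0)    # number of documents containing the term
            # Robertson/Sparck Jones IDF with +0.5 smoothing; the +1 keeps it positive
            idf = math.log((num_docs - n + 0.5) / (n + 0.5) + 1)
            # Saturating term frequency, normalized by document length
            denom = tf[term] + k1 * (1 - b + b * dl / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        return score

In a reading comprehension pipeline, doc_freq and avgdl would be precomputed over the passage collection, and the highest-scoring passages would be returned as answer candidates.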

Keywords

Large Language Model; natural language processing; reading comprehension; computational linguistics; information retrieval; BM25

Subject

Computer Science and Mathematics, Data Structures, Algorithms and Complexity

Comments (1)

Comment 1
Received: 30 March 2023
Commenter: Tingzhen Liu
Commenter's Conflict of Interests: Author
Comment:
1. Added highlights in the document.
2. Adjusted the narrative order of Chapter 4.