ARTICLE | doi:10.20944/preprints202301.0219.v2
Subject: Computer Science And Mathematics, Data Structures, Algorithms And Complexity
Keywords: Large Language Model; natural language processing; reading comprehension; computational linguistics; information retrieval; BM25
Online: 30 March 2023 (03:51:37 CEST)
Large language models (LLMs) represent a major advancement in AI and have been applied to many natural language processing tasks. Nevertheless, in different business scenarios an LLM requires fine-tuning by engineers to achieve satisfactory performance, and the cost of fine-tuning may not match the target performance gains. Based on the Baidu STI dataset, we study the upper bound of the performance that classical information retrieval methods can achieve in a specific business setting, and compare it with the cost and performance of the participating teams' LLM-based solutions. This paper gives insight into the potential of classical computational linguistics algorithms, which can help decision-makers choose reasonably between LLMs and low-cost methods in business R&D.
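The classical baseline named in the keywords, Okapi BM25, can be stated compactly. As a minimal illustrative sketch (the corpus, tokenizer, and parameter values below are assumptions, not details from the paper), a BM25 ranker over whitespace-tokenized documents looks like:

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in `docs` against `query` with Okapi BM25.

    Uses the common IDF variant log((N - n + 0.5) / (n + 0.5) + 1),
    which keeps scores non-negative. k1 and b are the usual defaults.
    """
    tokenized = [d.lower().split() for d in docs]
    N = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / N

    # document frequency of each term
    df = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1

    scores = []
    for d in tokenized:
        tf = Counter(d)
        s = 0.0
        for q in query.lower().split():
            if q not in tf:
                continue
            idf = math.log((N - df[q] + 0.5) / (df[q] + 0.5) + 1)
            s += idf * tf[q] * (k1 + 1) / (
                tf[q] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(s)
    return scores
```

A ranker like this needs no training, which is the cost asymmetry the abstract weighs against fine-tuning an LLM.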
ARTICLE | doi:10.20944/preprints202301.0402.v1
Subject: Computer Science And Mathematics, Mathematics
Keywords: Sequence Encoder; Autoregressive Sequence; Separated Model; Statistical Test; Neural Network
Online: 23 January 2023 (08:30:48 CET)
While language models widely use a stop sign as an independent token to decide when generation should end, this may enlarge the vocabulary and lead to further problems. Similarly, current research on game algorithms usually estimates stopping-point-related problems from evaluations of the winning rate; however, such models may also contain redundant information, which increases training difficulty. These two types of tasks (and similar autoregressive tasks) share a common problem of stopping-point prediction. In this paper, we describe a separated-model design that decouples the complexity of stopping-point prediction from the main task model, so that the information used to estimate the stopping point can be reduced. On this basis, to verify the rationality of using a separated model, we propose a model-free test method. It judges the separability of the transformed data using point-difference and sequence-difference metrics, and can thereby predict the credibility of the separated model's inference.
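The separated-model idea can be made concrete with a toy generation loop. This is a hypothetical sketch, not the paper's architecture: `main_model_step` stands in for an autoregressive model over content tokens only (no stop token in its vocabulary), while `stop_predictor` is a separate, lightweight model that sees only a reduced summary of the sequence (here just its length) instead of the full hidden state.

```python
def main_model_step(prefix):
    """Stand-in for the main autoregressive model.

    It predicts only content tokens; its vocabulary contains
    no stop token, so its output dimension stays smaller.
    """
    return "tok%d" % len(prefix)

def stop_predictor(prefix, max_len=5):
    """Separated stopping model.

    Decides when to halt from a low-dimensional feature of the
    sequence (its length), separated from the main task model.
    """
    return len(prefix) >= max_len

def generate():
    """Alternate the two models: content steps until the stop
    predictor fires."""
    seq = []
    while not stop_predictor(seq):
        seq.append(main_model_step(seq))
    return seq
```

Because the stopping decision lives in its own model, the information it consumes can be reduced independently of the main task, which is the separability the proposed model-free test is meant to assess.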