Preprint | Article | Version 2 | Preserved in Portico | This version is not peer-reviewed

Arabic Chatbot Evaluation Based on Extractive Question-Answering Transfer Learning and Language Transformers

Version 1 : Received: 7 July 2023 / Approved: 10 July 2023 / Online: 10 July 2023 (11:05:08 CEST)
Version 2 : Received: 10 July 2023 / Approved: 11 July 2023 / Online: 11 July 2023 (09:59:41 CEST)

A peer-reviewed article of this Preprint also exists.

Alruqi, T.N.; Alzahrani, S.M. Evaluation of an Arabic Chatbot Based on Extractive Question-Answering Transfer Learning and Language Transformers. AI 2023, 4, 667-691.

Abstract

Chatbots are computer programs that use artificial intelligence to imitate human conversation. Recent advances in deep learning have drawn interest to language transformers, which, unlike traditional chatbots, do not rely on predefined rules and responses. This study reviews previous research on chatbots that employ deep learning and transfer learning models. Specifically, it examines current trends in combining language transformers with transfer learning to evaluate how well Arabic chatbots understand conversation context and exhibit natural behavior. The proposed methods explore the AraBERT, CAMeLBERT, AraElectra-SQuAD, and AraElectra (Generator/Discriminator) transformers, using different variants of these transformers and of semantic embedding models. Two datasets were used for evaluation: one with 398 questions and corresponding documents, and another with 1395 questions and 365,568 documents sourced from Arabic Wikipedia. Extensive experiments were conducted on both manually crafted questions and the entire question set, using confidence and similarity metrics. The AraElectra-SQuAD model achieved an average confidence score of 0.6422 and an average similarity score of 0.9773 on the first dataset, and an average confidence score of 0.6658 and a similarity score of 0.9660 on the second dataset. The study concludes that AraElectra-SQuAD consistently outperformed the other models, with high confidence and similarity scores as well as robustness, which highlights its potential for practical natural language processing applications in Arabic chatbots. The study suggests that the model can be further enhanced and applied in tasks such as chatbots, virtual assistants, and information retrieval systems for Arabic-speaking users. By combining the transformer architecture with fine-tuning on large SQuAD-like data, this approach can provide accurate and contextually relevant answers to questions in Arabic.
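The abstract reports two metrics, confidence and similarity, without giving their formulas. As a rough illustration only, the sketch below assumes a common convention for extractive question answering: confidence is the product of the softmax probabilities of the chosen answer start and end positions, and similarity is the cosine similarity between two embedding vectors. The function names, the `max_len` parameter, and the toy logits and vectors are all hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(start_logits, end_logits, max_len=30):
    """Return (start, end, confidence) for the most probable answer span.

    Assumption: confidence = P(start) * P(end), with end >= start and a
    span of at most max_len tokens. The paper may define it differently.
    """
    p_start = softmax(start_logits)
    p_end = softmax(end_logits)
    best = (0, 0, 0.0)
    for i, ps in enumerate(p_start):
        for j in range(i, min(i + max_len, len(p_end))):
            score = ps * p_end[j]
            if score > best[2]:
                best = (i, j, score)
    return best


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy values standing in for model outputs (hypothetical, 4 tokens).
start_logits = [0.1, 4.0, 0.2, 0.1]
end_logits = [0.0, 0.3, 5.0, 0.1]
start, end, confidence = best_span(start_logits, end_logits)

# Toy embeddings of a predicted answer and a reference answer.
predicted_vec = [0.2, 0.7, 0.1]
reference_vec = [0.25, 0.65, 0.05]
similarity = cosine_similarity(predicted_vec, reference_vec)

print(f"span=({start}, {end}) confidence={confidence:.4f} similarity={similarity:.4f}")
```

In a real evaluation the logits would come from a fine-tuned reader such as AraElectra-SQuAD and the vectors from a semantic embedding model. Averaging the per-question values would then give dataset-level scores comparable in kind to those reported above.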

Keywords

Arabic; chatbot; transfer learning; AraBERT; CAMeLBERT; AraElectra (Generator/Discriminator); AraElectra-SQuAD

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning

Comments (1)

Comment 1
Received: 11 July 2023
Commenter: Salha Alzahrani
Commenter's Conflict of Interests: Author
Comment: We deleted figure 1 due to copyrights issues