Preprint. Article. Version 1. Preserved in Portico. This version is not peer-reviewed.

A Comparative Study of Large Language Models in Explaining Intrinsically Disordered Proteins

Version 1 : Received: 13 August 2023 / Approved: 14 August 2023 / Online: 14 August 2023 (09:44:19 CEST)

How to cite: Gonzalez, D.T.; Djulbegovic, M.; Kim, C.; Antonietti, M.; Gameiro, G.R.; Uversky, V.N. A Comparative Study of Large Language Models in Explaining Intrinsically Disordered Proteins. Preprints 2023, 2023081014. https://doi.org/10.20944/preprints202308.1014.v1

Abstract

(1) Background: Artificial Intelligence (AI) models have shown potential in various educational contexts. However, their utility in explaining complex biological phenomena, such as Intrinsically Disordered Proteins (IDPs), requires further exploration. This study empirically evaluated the performance of several Large Language Models (LLMs) in the educational domain of IDPs. (2) Methods: Four LLMs, GPT-3.5, GPT-4, GPT-4 with Browsing, and Google Bard (PaLM 2), were assessed using a set of IDP-related questions. An expert evaluated their responses across five categories: accuracy, relevance, depth of understanding, clarity, and overall quality. Descriptive statistics, ANOVA, and Tukey's honestly significant difference (HSD) tests were used for analysis. (3) Results: The GPT-4 model consistently outperformed the others across all evaluation categories. Although the performance of GPT-4 and GPT-3.5 did not differ significantly (p > 0.05), the GPT-4 response was preferred as the best in 13 out of 15 instances. The AI models with browsing capabilities, GPT-4 with Browsing and Google Bard (PaLM 2), displayed lower performance metrics across the board, with statistically significant differences (p < 0.0001). (4) Conclusion: Our findings underscore the potential of AI models, particularly LLMs such as GPT-4, in enhancing scientific education, especially in complex domains such as IDPs. Continued innovation and collaboration among AI developers, educators, and researchers are essential to fully harness the potential of AI for enriching scientific education.
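
To illustrate the kind of comparison the Methods describe, the following is a minimal sketch of a one-way ANOVA F statistic computed across groups of expert ratings. It is not the authors' analysis code. The function name `one_way_anova_f`, the group labels, and the rating values are all hypothetical placeholders invented for illustration; none of the numbers come from the study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of scores per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical placeholder ratings on a 1-5 scale (NOT data from the study).
scores = {
    "model_a": [5, 4, 5, 4, 5],
    "model_b": [4, 4, 5, 4, 4],
    "model_c": [2, 3, 2, 3, 2],
}

f_stat, df_b, df_w = one_way_anova_f(list(scores.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value indicates that the group means differ by more than the within-group scatter would suggest. Converting F to a p-value and running the pairwise Tukey HSD follow-up both require reference distributions that the standard library does not provide; in practice a statistics package would be used, for example `scipy.stats.f_oneway` and `scipy.stats.tukey_hsd` in SciPy.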

Keywords

Artificial Intelligence; Generative Pre-Trained Transformers (GPT); Intrinsically Disordered Proteins (IDPs); Large Language Models (LLMs); Pathways Language Model 2 (PaLM 2)

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
