Submitted: 16 May 2025
Posted: 16 May 2025
Abstract
Keywords:
1. Introduction
2. Materials and Methods
2.1. Research Materials
2.2. Research Methods
1. Comparison of text accuracy;
2. Comparison of text readability;
3. Comparison of text comprehensibility;
4. Analysis of the factors influencing text readability;
5. Semantic comparison of text content.
2.3. Statistical Method
3. Results
3.1. Comparison of Text Accuracy
(1) The consistency of the text
(2) Stability comparison
(3) Extreme values
3.2. Text Readability Comparison
(1) Comparison of text consistency
(2) Stability comparison
(3) Extreme comparison
3.3. Textual Comprehensibility Comparison
(1) The overall comparison of the output level
(2) Analysis of output stability
(3) Extreme comparison
3.4. Text Readability Influence Factor Analysis
3.4.1. Neural Network Algorithm Construction
3.4.2. Influence Factor Analysis
3.5. Text Content Semantic Comparison
3.5.1. Lexical Frequency Statistics
3.5.2. Topic Mining
4. Discussion
5. Conclusions
6. Study Limitations
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

| Rank | ChatGPT: Word | ChatGPT: Frequency | Gemini: Word | Gemini: Frequency | Kimi: Word | Kimi: Frequency | Ernie Bot: Word | Ernie Bot: Frequency |
|---|---|---|---|---|---|---|---|---|
| 1 | COVID-19 | 556 | COVID-19 | 202 | COVID-19 | 253 | COVID-19 | 283 |
| 2 | Patient | 328 | Medical | 148 | Test | 191 | Patient | 194 |
| 3 | Infect | 293 | Health | 100 | Risk | 180 | Infect | 152 |
| 4 | Risk | 284 | Healthcare | 95 | Patient | 171 | Risk | 140 |
| 5 | Test | 277 | Risk | 95 | Vaccine | 154 | Test | 131 |
| 6 | Severe | 233 | Advice | 89 | Infect | 137 | Symptom | 112 |
| 7 | Symptom | 231 | Infected | 86 | Severe | 131 | Severe | 108 |
| 8 | Individual | 201 | Test | 73 | Symptom | 130 | Healthcare | 105 |
| 9 | Vaccine | 201 | Individual | 69 | Recommend | 103 | Medical | 99 |
| 10 | Viral | 200 | Symptom | 67 | Sars-cov | 103 | Health | 93 |
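
A ranking like the one in the table above can be produced by counting normalized tokens in each model's output. The following is a minimal sketch only, not the authors' pipeline: the stop-word list, the tokenization pattern, and the `top_words` helper are all assumptions made for illustration. The table entries look stemmed (for example "Infect"), and this sketch leaves that step out.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the source does not state which list was used.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "with", "or", "are"}

def top_words(text, n=10):
    """Return the n most frequent non-stop-word tokens as (word, count) pairs."""
    # Lowercase, then keep alphanumeric tokens with optional internal hyphens
    # so that terms such as "covid-19" stay in one piece.
    tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

# Toy input standing in for one model's collected answers.
sample = "COVID-19 patients with severe symptoms need a test. The test shows COVID-19 risk."
ranking = top_words(sample, n=3)
for rank, (word, freq) in enumerate(ranking, start=1):
    print(rank, word, freq)
```

Running `top_words` once per model and placing the results side by side gives a table of the same shape as the one above.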
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).