Submitted: 08 October 2025
Posted: 09 October 2025
Abstract
Keywords:
1. Introduction
2. Materials and Methods
2.1. Study Design and Settings
2.2. Establishment of Multiple-Choice Question Bank
2.3. Selection of Multiple-Choice Questions
2.4. Inclusion and Exclusion of MCQs
2.5. Preparation of Test Paper for ChatGPT
2.6. Score System
2.7. Statistical Analysis
2.8. Ethical Approval
3. Results
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| USMLE | United States Medical Licensing Examination |
| MCQs | Multiple-Choice Questions |
| PBL | Problem-Based Learning |
| LLMs | Large Language Models |
| AI | Artificial Intelligence |




| Disciplines of Basic Medical Sciences | MCQs Pool | Number of MCQs Selected |
|---|---|---|
| Anatomy | 42 | 28 |
| Histology | 38 | 23 |
| Microbiology | 57 | 21 |
| Pathology | 62 | 33 |
| Physiology | 20 | 19 |
| Total | 219 | 124 |

| Disciplines | Total MCQs | ChatGPT Marks | Accuracy % | 95% CI |
|---|---|---|---|---|
| Anatomy | 28 | 27 | 96.4% | 87.7–99.0% |
| Histology | 23 | 23 | 100% | 85.7–100% |
| Microbiology | 21 | 20 | 95.2% | 79.8–99.3% |
| Pathology | 33 | 32 | 97.0% | 88.8–99.4% |
| Physiology | 19 | 19 | 100% | 82.4–100% |
| Total | 124 | 121 | 97.6% (98%) | 93.4–99.1% |
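
The accuracy column above follows directly from the marks and question counts, and can be re-derived with a short script. The sketch below is illustrative only: the counts are copied from the table, while the function names and the choice of a Wilson score interval are assumptions, since the page does not state which interval method the authors used. The computed bounds may therefore differ somewhat from the 95% CI column.

```python
import math

# Correct answers and question counts per discipline, copied from the table above.
results = {
    "Anatomy": (27, 28),
    "Histology": (23, 23),
    "Microbiology": (20, 21),
    "Pathology": (32, 33),
    "Physiology": (19, 19),
}


def accuracy(correct, total):
    """Percentage of questions answered correctly."""
    return 100.0 * correct / total


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a binomial proportion, in percent.

    Assumption: the source does not name its interval method; Wilson is
    used here only as one common choice for small samples.
    """
    p = correct / total
    denom = 1.0 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    lower = max(0.0, centre - half)
    upper = min(1.0, centre + half)
    return 100.0 * lower, 100.0 * upper


# Column totals, which should agree with the "Total" row of the table.
total_correct = sum(c for c, _ in results.values())
total_questions = sum(n for _, n in results.values())

for name, (c, n) in results.items():
    low, high = wilson_interval(c, n)
    print(f"{name}: {c}/{n} = {accuracy(c, n):.1f}% (Wilson 95% CI {low:.1f}-{high:.1f}%)")

low, high = wilson_interval(total_correct, total_questions)
print(f"Total: {total_correct}/{total_questions} = "
      f"{accuracy(total_correct, total_questions):.1f}% (Wilson 95% CI {low:.1f}-{high:.1f}%)")
```

With only 19 to 33 questions per discipline, a single wrong answer shifts accuracy by three to five percentage points, so the per-discipline intervals are necessarily wide regardless of the method chosen.
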

| Disciplines | Total MCQs | ChatGPT Correct | Errors | Mode of Failure |
|---|---|---|---|---|
| Anatomy | 28 | 27 | 1 | Occasional minor mistakes |
| Histology | 23 | 23 | 0 | Perfect performance |
| Microbiology | 21 | 20 | 1 | Occasional minor mistakes |
| Pathology | 33 | 32 | 1 | Occasional minor mistakes |
| Physiology | 19 | 19 | 0 | Perfect performance |
| Total | 124 | 121 | 3 | Occasional minor mistakes (overall accuracy very high) |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).