Submitted: 07 November 2025
Posted: 10 November 2025
Abstract
Keywords:
1. Introduction
2. Materials and Methods
3. Results
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
References

| Reader | Accuracy of AI (%) | Accuracy of readers (%) | Accuracy with AI (%) | Absolute Gain (%p) | Improved Cases | Worsened Cases | NNT | p-value | Effect Size (Cohen's d) |
|---|---|---|---|---|---|---|---|---|---|
| Case B | 84.5 | 87.8 | 93.3 | +5.5 | 113 | 38 | 5 | <0.0001 | 0.384 |
| Case S | 84.5 | 76.1 | 82.7 | +6.6 | 198 | 83 | 9 | <0.0001 | 0.391 |
| Case C | 84.5 | 83.0 | 94.1 | +11.1 | 237 | 36 | 8 | <0.0001 | 0.767 |
| Case Y | 84.5 | 84.4 | 94.2 | +9.8 | 203 | 32 | 10 | <0.0001 | 0.681 |
| Mean | 84.5 | 82.8 | 91.1 | | | | | | |
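
The gain and mean columns of the table above can be rechecked directly from the per-reader accuracies. The sketch below does that and also runs an exact two-sided sign test on the improved and worsened case counts. The choice of test is an assumption made for illustration: the table does not say which test produced its p-values, and the function names are hypothetical.

```python
from math import comb

# Values copied from the table above:
# (accuracy without AI %, accuracy with AI %, improved cases, worsened cases)
READERS = {
    "Case B": (87.8, 93.3, 113, 38),
    "Case S": (76.1, 82.7, 198, 83),
    "Case C": (83.0, 94.1, 237, 36),
    "Case Y": (84.4, 94.2, 203, 32),
}


def absolute_gain(without_ai, with_ai):
    """Absolute gain in percentage points, rounded to one decimal."""
    return round(with_ai - without_ai, 1)


def exact_sign_test(improved, worsened):
    """Two-sided exact binomial p-value on the discordant cases.

    Assumption: an illustrative McNemar-style test, not necessarily
    the test behind the p-values reported in the table.
    """
    n = improved + worsened
    k = min(improved, worsened)
    # Probability of k or fewer "minority" outcomes under p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


gains = {name: absolute_gain(a, b) for name, (a, b, _, _) in READERS.items()}
mean_without = round(sum(v[0] for v in READERS.values()) / len(READERS), 1)
mean_with = round(sum(v[1] for v in READERS.values()) / len(READERS), 1)
p_values = {name: exact_sign_test(i, w) for name, (_, _, i, w) in READERS.items()}

for name in READERS:
    print(f"{name}: gain {gains[name]:+.1f} %p, sign-test p = {p_values[name]:.2e}")
print(f"Mean accuracy: {mean_without} % without AI, {mean_with} % with AI")
```

Under these assumptions the recomputed gains and means agree with the tabulated values, and each sign-test p-value falls below 0.0001.
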

| Pathology | AI Only: FP/FN | AI Only: F1-score | AI Only: MCC | Physician Only: FP/FN | Physician Only: F1-score | Physician Only: MCC | Physician + AI: FP/FN | Physician + AI: F1-score | Physician + AI: MCC |
|---|---|---|---|---|---|---|---|---|---|
| DR | 6/6 | 0.943 | 0.928 | 59.5 ± 29.0 / 22.2 ± 10.4 | 0.670 ± 0.043 | 0.578 ± 0.051 | 21.8 ± 14.5 / 14.8 ± 17.6 | 0.830 ± 0.150 | 0.785 ± 0.190 |
| ERM | 3/29 | 0.837 | 0.809 | 29.0 ± 31.3 / 43.2 ± 35.0 | 0.594 ± 0.238 | 0.558 ± 0.178 | 12.8 ± 5.7 / 26.0 ± 21.5 | 0.805 ± 0.156 | 0.762 ± 0.179 |
| GS | 3/34 | 0.816 | 0.786 | 35.0 ± 21.4 / 58.2 ± 30.7 | 0.502 ± 0.257 | 0.396 ± 0.305 | 30.2 ± 29.0 / 19.2 ± 4.5 | 0.804 ± 0.085 | 0.745 ± 0.115 |
| MD | 3/47 | 0.706 | 0.684 | 58.2 ± 25.6 / 55.0 ± 25.1 | 0.457 ± 0.200 | 0.310 ± 0.255 | 43.8 ± 40.2 / 29.2 ± 9.6 | 0.696 ± 0.155 | 0.606 ± 0.212 |
| Normal | 6/39 | 0.151 | 0.160 | 56.0 ± 22.3 / 18.2 ± 7.5 | 0.390 ± 0.123 | 0.333 ± 0.148 | 18.2 ± 11.1 / 23.2 ± 9.9 | 0.465 ± 0.164 | 0.435 ± 0.156 |
| RVO | 15/5 | 0.896 | 0.873 | 15.2 ± 10.7 / 47.8 ± 19.5 | 0.557 ± 0.173 | 0.515 ± 0.171 | 14.0 ± 9.4 / 19.0 ± 17.4 | 0.808 ± 0.167 | 0.769 ± 0.196 |
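
For readers who want to relate the FP/FN counts to the F1-score and MCC columns, the sketch below gives the standard definitions of both metrics for a single one-vs-rest pathology label. The table reports only FP and FN, so the confusion-matrix counts used here are hypothetical and are not taken from the study. The function names are likewise illustrative.

```python
from math import sqrt


def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when any marginal is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for one pathology (NOT from the study).
tp, tn, fp, fn = 80, 390, 10, 20
f1 = f1_score(tp, fp, fn)
m = mcc(tp, tn, fp, fn)
print(f"F1 = {f1:.3f}, MCC = {m:.3f}")
```

Unlike F1, MCC also uses the true-negative count, which is one reason the two columns can rank pathologies differently when a class is rare.
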

| Pathology | Case B: Accuracy without AI (%) | Case B: Accuracy with AI (%) | Case B: Absolute Gain (%p) | Case S: Accuracy without AI (%) | Case S: Accuracy with AI (%) | Case S: Absolute Gain (%p) | Case C: Accuracy without AI (%) | Case C: Accuracy with AI (%) | Case C: Absolute Gain (%p) | Case Y: Accuracy without AI (%) | Case Y: Accuracy with AI (%) | Case Y: Absolute Gain (%p) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DR | 88.4 | 93.8 | +5.4 | 80.4 | 83.6 | +3.2 | 84.6 | 96.4 | +11.8 | 80.4 | 97.0 | +16.6 |
| ERM | 87.4 | 94.0 | +6.6 | 81.0 | 84.2 | +3.2 | 81.6 | 95.8 | +14.2 | 85.2 | 95.0 | +9.8 |
| GS | 80.8 | 92.2 | +11.4 | 67.8 | 81.6 | +13.8 | 83.4 | 92.6 | +9.2 | 88.0 | 94.0 | +6.0 |
| MD | 84.8 | 89.6 | +4.8 | 67.0 | 70.6 | +3.6 | 75.2 | 91.2 | +16.0 | 76.6 | 90.2 | +13.6 |
| Normal | 89.4 | 90.4 | +1.0 | 78.2 | 90.6 | +12.4 | 83.2 | 92.8 | +9.6 | 87.6 | 93.0 | +5.4 |
| RVO | 88.4 | 96.0 | +7.6 | 82.2 | 85.4 | +3.2 | 89.8 | 96.0 | +6.2 | 88.4 | 96.2 | +7.8 |
| DR + ERM | 0.0 | 14.3 | +14.3 | 0.0 | 28.6 | +28.6 | 57.1 | 42.9 | -14.3 | 57.1 | 42.9 | -14.3 |
| DR + GS | 40.0 | 40.0 | +0.0 | 0.0 | 20.0 | +20.0 | 20.0 | 60.0 | +40.0 | 40.0 | 60.0 | +20.0 |
| ERM + GS | 42.9 | 66.7 | +23.8 | 0.0 | 44.4 | +44.4 | 44.4 | 66.7 | +22.3 | 11.1 | 55.6 | +44.5 |
| ERM + RVO | 0.0 | 66.7 | +66.7 | 0.0 | 13.3 | +13.3 | 46.7 | 73.3 | +26.6 | 26.7 | 60.0 | +33.3 |
| GS + RVO | 0.0 | 80.0 | +80.0 | 0.0 | 40.0 | +40.0 | 20.0 | 70.0 | +50.0 | 30.0 | 60.0 | +30.0 |
| MD + DR | 0.0 | 0.0 | +0.0 | 0.0 | 0.0 | +0.0 | 50.0 | 50.0 | +0.0 | 50.0 | 50.0 | +0.0 |
| MD + ERM | 11.1 | 33.3 | +22.2 | 0.0 | 6.7 | +6.7 | 20.0 | 26.7 | +6.7 | 13.3 | 26.7 | +13.4 |
| MD + GS | 0.0 | 30.0 | +30.0 | 0.0 | 40.0 | +40.0 | 20.0 | 40.0 | +20.0 | 10.0 | 30.0 | +20.0 |
| MD + RVO | 0.0 | 0.0 | +0.0 | 0.0 | 33.3 | +33.3 | 33.3 | 0.0 | -33.3 | 0.0 | 0.0 | +0.0 |
| ERM + GS + RVO | 0.0 | 100.0 | +100.0 | 0.0 | 100.0 | +100.0 | 0.0 | 100.0 | +100.0 | 0.0 | 100.0 | +100.0 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).