Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Substantiation of the Sample Size for Assessing a Diagnostic Accuracy of Software Based on AI Technology for Diagnostic Radiology

Version 1 : Received: 14 December 2023 / Approved: 15 December 2023 / Online: 15 December 2023 (06:18:00 CET)

How to cite: N.Yu., N.; K.M., A.; Yu.A., V.; T.M., B.; S.F., C.; O.V., O.; A.V., V. Substantiation of the Sample Size for Assessing a Diagnostic Accuracy of Software Based on AI Technology for Diagnostic Radiology. Preprints 2023, 2023121136. https://doi.org/10.20944/preprints202312.1136.v1

Abstract

The literature has repeatedly raised the question of how much data is necessary and sufficient to validate models that predict the risk of adverse events for patients or that classify the presence or absence of pathological features. In this study, we propose a new approach for determining the necessary and sufficient number of studies for validating medical software based on artificial intelligence technology whose main task is to classify medical X-ray images as normal or pathological. We show that the number of studies in a dataset at which AUC ROC reaches maximum heterogeneity varies with the balance of the "normal" and "abnormality" classes. For a balance of 90% "normal" to 10% "abnormality", maximum heterogeneity is reached at 190 studies; for 80% "normal" to 20% "abnormality", at 80 studies; for 70% to 30%, at 120 studies; for 60% to 40%, at 110 studies; and for 50% to 50%, at 70 studies. The obtained data are in good agreement with previous results and make it possible to determine a sufficient (necessary) number of studies in a dataset for an unbiased assessment of AUC ROC.
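The dependence of AUC ROC variability on dataset size and class balance can be explored with a small simulation. The sketch below is not the authors' procedure: it assumes synthetic classifier scores drawn from two unit-variance Gaussians, and it uses the standard deviation of AUC across repeated draws as a stand-in for "heterogeneity". The names `auc_roc`, `auc_spread`, `pathology_share`, and `separation` are hypothetical and chosen only for illustration.

```python
import random
import statistics


def auc_roc(neg_scores, pos_scores):
    """AUC as the Mann-Whitney probability that a positive outscores a negative."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


def auc_spread(n_studies, pathology_share, n_repeats=100, separation=1.5, seed=0):
    """Mean and standard deviation of AUC over repeated synthetic datasets.

    Assumption: "normal" scores ~ N(0, 1), "abnormality" scores ~ N(separation, 1).
    """
    rng = random.Random(seed)
    n_pos = max(1, round(n_studies * pathology_share))
    n_neg = n_studies - n_pos
    if n_neg < 1:
        raise ValueError("need at least one 'normal' study")
    aucs = []
    for _ in range(n_repeats):
        neg = [rng.gauss(0.0, 1.0) for _ in range(n_neg)]
        pos = [rng.gauss(separation, 1.0) for _ in range(n_pos)]
        aucs.append(auc_roc(neg, pos))
    return statistics.mean(aucs), statistics.stdev(aucs)


if __name__ == "__main__":
    # Class balances from the abstract; dataset sizes chosen for illustration.
    for share in (0.1, 0.2, 0.3, 0.4, 0.5):
        for n in (70, 110, 190):
            mean_auc, sd_auc = auc_spread(n, share)
            print(f"pathology={share:.0%} n={n}: mean AUC={mean_auc:.3f}, SD={sd_auc:.3f}")
```

Under these assumptions the spread of AUC can be compared across dataset sizes for each class balance. The specific study counts reported in the abstract come from the authors' own data and method and are not expected to be reproduced by this toy model.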

Keywords

number of samples; AUC ROC; AI-technology; X-rays; diagnostic radiology; diagnostic accuracy

Subject

Computer Science and Mathematics, Computational Mathematics
