Artificial Intelligence in Obstetrics: Current Trends and Future Directions

Ittai T Many; Ariel Many

doi:10.20944/preprints202606.1633.v1

Submitted: 22 June 2026

Posted: 23 June 2026

Abstract

Recent advances in artificial intelligence (AI), especially machine learning (ML), deep learning (DL), natural language processing (NLP), and computer vision, have rapidly impacted obstetric care. Key applications include automated ultrasound interpretation (biometry, anomaly detection), AI-enhanced fetal monitoring (cardiotocography), risk stratification (preeclampsia, preterm birth, hemorrhage), labor management (delivery mode prediction), genomic screening (NIPT interpretation), and remote/telehealth monitoring tools, especially in underserved areas. For example, DL models now attain accuracy comparable to experts for fetal ultrasound biometry[1], and FDA-cleared AI tools have achieved >97% detection of congenital heart defects during prenatal screening[2]. However, clinical readiness varies: some technologies (e.g., ultrasound AI tools) are already FDA-cleared[2,3], whereas others remain at proof-of-concept. Across studies, performance metrics (AUC, R2, accuracy) are generally high (often >0.85) but depend on data quality and label definitions[4,5]. Crucial issues include dataset biases, lack of standardization, explainability, and regulatory oversight. This review synthesizes AI technologies, applications, validation (metrics, datasets), deployment (trials, approval, integration), and ethical considerations, and identifies knowledge gaps. Overall, AI shows promise to improve prenatal diagnosis and individualized care, but requires rigorous validation, transparent algorithms, and clinician oversight to ensure safety, equity and trust.

Keywords: AI in OBGYN; echocardiography; CTG; labor prediction; image analysis

Subject: Medicine and Pharmacology - Obstetrics and Gynaecology

Introduction

Artificial intelligence (AI), which broadly encompasses machine learning (ML; traditional algorithms such as logistic regression and random forest), deep learning (DL; neural networks such as CNNs and RNNs), computer vision (image analysis), and natural language processing (NLP), is transforming obstetrics. ML and DL can automatically extract complex patterns from large datasets (images, signals, EHR) to support diagnosis and prediction [6,7]. For example, AI algorithms have learned to identify subtle ultrasound features associated with congenital anomalies that may elude clinicians [8]. Key AI tools in obstetrics include convolutional neural networks for ultrasound image analysis, recurrent models for time-series data (e.g., CTG signals), ensemble learners for risk prediction, and, more recently, large language models (LLMs) for information synthesis and patient communication. These technologies have been applied across the pregnancy continuum, from early screening and anomaly detection to intrapartum monitoring and postpartum risk assessment.

Emerging literature shows that AI systems can approach or exceed human-level performance in many obstetric tasks. For instance, a multi-center study demonstrated that DL models placed ultrasound biometry calipers with “human-level performance” in 20-week scans, providing calibrated confidence intervals for measurements [1]. Similarly, DL has achieved 96.3% accuracy classifying normal vs. abnormal fetal brain ultrasound [9]. In clinical screening, FDA-cleared AI tools have increased fetal congenital heart defect detection to >97% sensitivity, outperforming unaided expert review [2]. Furthermore, AI models have generated delivery-date predictions from routine sonograms with an R2 up to 0.92 [5,10]. These successes suggest AI’s potential to enhance diagnostic precision, standardize interpretation, and shorten workflow time in perinatal imaging.

However, many obstetric AI studies are retrospective or proof of concept only. Few have undergone rigorous prospective trials or independent validation. Real-world deployment raises further challenges: performance can vary with data quality, patient demographics, and equipment. Ethical and regulatory frameworks are still evolving, and the limited explainability of many models, often described as “black boxes”, slows the building of trust among physicians and regulators. This review critically examines the state of AI in obstetrics, covering technologies, applications (prenatal screening, ultrasound analysis, fetal monitoring/CTG, labor and delivery decision support, maternal risk stratification, genomics/NIPT, telehealth), as well as validation methods, performance metrics, bias and fairness, explainability, clinical trials, regulatory approvals, workflow integration, privacy/ethics, and future research priorities.

Methods

We performed a structured literature search to identify relevant studies from January 2016 through March 2026 (with inclusion of very few earlier reports). Using PubMed, Embase, and Google Scholar, we combined terms such as “artificial intelligence”, “machine learning”, “deep learning”, “computer vision”, “natural language processing” with obstetric keywords (e.g., “prenatal screening”, “ultrasound”, “CTG”, “preeclampsia”, “telemedicine”). We also reviewed key articles and relevant guidelines (WHO, ACOG, RCOG). Inclusion criteria were peer-reviewed articles, systematic reviews, and major reports focusing on AI applications in obstetric care (imaging, monitoring, genomics, decision support, etc.). We prioritized sources from major journals and conferences (e.g., Ultrasound Obstet Gynecol, Nat Med, npj Digital Medicine, BJOG, JAMA), as well as AI in medicine venues (Nature, Science, IEEE). Non-English articles, opinion pieces without data, or studies unrelated to clinical obstetrics were excluded. The search followed PRISMA guidelines [11], and data extraction focused on AI methods used, datasets, performance metrics (AUC, accuracy, sensitivity, R2, etc.), and validation strategies. Reported findings were synthesized qualitatively. This is not a meta-analysis.

Key AI Technologies

AI in obstetrics leverages several core technologies:

Machine Learning (ML): Traditional ML methods (logistic regression, decision trees, random forest, support vector machines) are widely used for risk modeling. Their strengths include interpretability and lower data needs, but they often plateau in complex pattern recognition. Many risk-prediction models (e.g., for pre-eclampsia or hemorrhage) use ML classifiers or ensemble methods [4,12].

Deep Learning (DL): DL—especially convolutional neural networks (CNNs)—excels in image analysis (e.g., ultrasound frames) and signal patterns. Recurrent neural networks (RNNs) and attention-based models are employed for time-series data (e.g., CTG tracing). DL models automatically learn features, often surpassing traditional ML in complex tasks; for example, CNNs achieved >90% accuracy in fetal anatomical image classification [9]. However, they require large datasets and are often “black boxes”—users know the input and the output but the in-between process remains mostly non-explainable.

Computer Vision: computer vision techniques analyze imaging data. In obstetrics, these methods detect standard ultrasound planes (e.g., head, abdomen), measure biometry (head circumference, femur length), and flag fetal anomalies. Examples include automated caliper placement (fetal biometry) [1] and segmentation of fetal structures. Advanced vision models (like U-Nets) enable real-time ultrasound guidance and volumetric measurements.
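As a concrete illustration of the biometry step, one common way to report head circumference is to fit an ellipse to the skull outline and compute its perimeter. The sketch below assumes the ellipse has already been fitted and uses Ramanujan's first approximation for the perimeter; the function names, semi-axis values, and pixel spacing are hypothetical and are not taken from any cited system.

```python
import math


def ellipse_circumference(semi_major: float, semi_minor: float) -> float:
    """Approximate the perimeter of an ellipse (Ramanujan's first approximation)."""
    a, b = semi_major, semi_minor
    return math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))


def head_circumference_mm(semi_major_px: float, semi_minor_px: float,
                          mm_per_px: float) -> float:
    """Convert fitted ellipse semi-axes from pixels to mm and return the perimeter."""
    return ellipse_circumference(semi_major_px * mm_per_px,
                                 semi_minor_px * mm_per_px)


# Hypothetical fitted ellipse: semi-axes in pixels and an assumed pixel spacing.
hc_mm = head_circumference_mm(semi_major_px=300, semi_minor_px=240, mm_per_px=0.15)
print(f"Estimated head circumference: {hc_mm:.1f} mm")
```

In a real pipeline the semi-axes would come from a segmentation or landmark model, and the pixel spacing from the image metadata.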

Natural Language Processing (NLP) and Large Language Models (LLMs): Although traditional NLP applications remain relatively under-explored in obstetrics, the rapid evolution of LLMs offers significant promise for clinical workflows. Recent language models have demonstrated remarkable proficiency in handling complex medical data; for instance, one study noted that ChatGPT outperformed human candidates in official OBGYN examinations and generated diagnostic recommendations comparable to domain experts [13]. Additionally, LLMs hold substantial potential for clinical decision support, including automating patient education and summarizing complex clinical guidelines or dense EHRs. However, integrating these models into obstetric practice demands strict caution. The inherent risk of AI “hallucination”—where a model generates factually incorrect or fabricated medical information with high confidence—poses severe clinical safety risks where accurate, evidence-based guidance is essential.

Collectively, these AI technologies can be applied to diverse obstetric data types (images, signals, text) to support decision-making. Their integration should be guided by domain knowledge and clinical needs, as emphasized by guidelines recommending multidisciplinary collaboration and transparency [14].

Figure 1. Timeline of major AI-related milestones in obstetrics (selected examples with references).

Applications of AI in Obstetrics

Prenatal Screening and Ultrasound Imaging

Fetal Anomaly Detection: Computer vision and DL have been used to detect structural anomalies from ultrasound. For example, Xie et al. (2020) developed a CNN-based pipeline to segment fetal brain structures and classify scans as normal or abnormal. The model achieved 96.3% classification accuracy (sensitivity 96.9%, specificity 95.9%, AUC 0.989) [9]. Similarly, CNNs have been applied to first-trimester ultrasound frames to screen for limb, facial, and cardiac anomalies, often outperforming manual review. Commercial software (e.g., quantusFLM) can now analyze standard 2D images for fetal lung maturity, predicting neonatal respiratory distress with roughly 86–90% accuracy in multi-center trials [16]. These tools reduce operator dependence and can flag subtle findings for expert review.

Biometry and Growth Monitoring: AI models automate fetal measurements (e.g., head circumference, femur length). Venturini et al. trained a whole-examination CNN using 48 million video frames; its biometric estimates matched expert sonographers (“human-level performance”) and provided calibrated uncertainty intervals [1]. Portable ultrasound combined with AI (e.g., phone-based probes) has been validated in resource-limited settings: in rural Zambia, a mobile AI tool estimated GA with a mean error of approximately ±4.5 days versus standard ultrasound, and accurately identified fetal presentation (AUC 0.977) [15]. AI-enabled ultrasound could democratize access by allowing midwives and nurses to perform reliable scans [3], particularly in rural areas.

Prenatal Screening (Genomic): AI is emerging in non-invasive prenatal testing (NIPT) and genomics. By analyzing cell-free fetal DNA sequencing data with ML, AI can improve detection of aneuploidies. One review notes AI can “help accurately predict Down syndrome” from maternal blood DNA patterns [17]. Polygenic risk scores for preterm birth or fetal conditions might also be refined by AI. Still, genomics-based AI is early-stage, with limited large-scale obstetric trials to date.

Fetal Echocardiography: AI assistance in fetal cardiac screening is proving effective. An FDA-approved deep-learning tool now auto-identifies key cardiac views and anomalies in standard prenatal scans. Clinical use (e.g., Mount Sinai 2025) increased detection of critical congenital heart defects from baseline to >97% [2], reducing missed diagnoses. The AI also cut review time by ~18% and increased reader confidence [2].

Fetal Monitoring (Cardiotocography)

Intrapartum fetal heart rate (FHR) and uterine contraction monitoring (CTG) have long suffered from subjective interpretation. AI promises objective analysis. Recent systematic reviews stress AI’s role in standardizing CTG interpretation [18]. DL models trained on CTG signals with outcomes (umbilical pH, Apgar) have achieved moderate success. For example, Chiou et al. (2025) trained a CNN on 552 CTGs with paired cord blood pH: the best model had AUROC ~0.68 for pH < 7.20 (similar to clinicians) [19]. Interestingly, models trained on objective labels (pH) outperformed those trained on subjective Apgar labels. A large review notes that AI for CTG often yields AUROCs in the 0.60–0.70 range [18,20], which is better than chance (0.50) but not yet clinically definitive. These modest results likely reflect small data sets and noisy signals. Explainable AI tools (e.g., saliency maps) are being introduced to highlight the CTG features driving decisions. Overall, AI-based CTG analysis must address bias (performance varying by hospital or population) and be validated prospectively [18,21].
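To make the AUROC figures above easier to interpret, the following minimal sketch computes AUROC directly from its rank-based definition: the probability that a randomly chosen positive case (e.g., low cord pH) receives a higher model score than a randomly chosen negative case, with ties counted as one half. The labels and scores are toy values, not data from any cited study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example: 1 = adverse outcome, 0 = normal outcome; scores are model outputs.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
print(f"AUROC = {auroc(toy_labels, toy_scores):.2f}")
```

An AUROC of 0.50 corresponds to random ranking, so values in the 0.60–0.70 range mean the model orders only a modest majority of case pairs correctly.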

Besides CTG, novel sensors (e.g., fetal ECG) coupled with AI may offer future monitoring enhancements. Some proof-of-concept studies use ML on noninvasive ECG or Doppler signals for early distress prediction, though clinical translation is pending.

Labor and Delivery Prediction/Decision Support

AI can assist decisions around labor and delivery. For example, ML models have been trained to predict the likelihood of cesarean delivery versus vaginal birth. A recent systematic review (17 studies) found that ensemble and neural models often achieved AUCs up to ~0.90–0.93 in predicting delivery mode (VBAC success, emergency C-section, etc.) [20]. Importantly, models incorporating real-time labor data (cervical dilation, contraction patterns) outperformed those using only baseline factors [20,22]. For instance, gradient boosting models using partograph data reached AUC > 0.90 in one cohort [23]. In general, AI models substantially exceeded traditional logistic regression (which often reaches AUC ~0.6–0.7) when dynamic intrapartum data were used [22]. Notably, simpler models remained competitive when ease of interpretation was prioritized.

AI has also been applied to labor induction success, epidural response, and postpartum hemorrhage risk during delivery. For example, one predictive tool (XGBoost) using admission vital signs and history predicted PPH (blood loss ≥ 1000 mL) with C index ≈ 0.93, outperforming clinician risk lists [12]. Similarly, early warning systems using ML on vital sign trends can flag deteriorating patients sooner than standard monitoring.

These decision support tools are currently investigational. Clinical trials embedding AI predictions into labor management pathways are few. Integration into practice will require demonstrating impact on outcomes (e.g., reduced unplanned C sections) without increasing interventions.

Maternal Risk Stratification

AI is increasingly used to predict antenatal complications and maternal morbidity. The largest focus has been on pre-eclampsia: systematic reviews report ML models (random forest, XGBoost, elastic net, etc.) achieving AUC ~0.86–0.97 using first trimester and routine data [4]. For example, a Taiwanese cohort (n-2,400) found XGBoost predicted any kind of pre-eclampsia with AUC ≈ 0.92 using blood pressure, BMI and routine laboratory results [24]. Similarly, ML has been applied to gestational diabetes, premature rupture of membranes, and other conditions, often leveraging routine antenatal records. In resource rich settings, incorporation of biomarkers (e.g., PAPP-A, placental growth factor) alongside clinical data further boosts model accuracy.

Other risk areas include postpartum hemorrhage: using a US registry of 228,000 births, ML models (XGBoost, random forest) at admission predicted PPH (EBL ≥ 1000 mL) with C index ≈0.92–0.95, notably better than logistic regression (~0.87) [12]. These models identified both known risk factors (placenta previa, macrosomia) and subtle interactions. Composite maternal morbidity (e.g., transfusion, ICU admission) is another target: ML on large EHRs can flag high-risk patients at triage. Importantly, risk stratification AI should not replace but augment clinical judgment, for example, by prompting preventive measures (aspirin, magnesium, frequent followup) in high-risk women.

Telemedicine and Resource-Limited Settings

AI can enhance remote obstetric care. During the COVID-19 pandemic, tele-OB/GYN usage surged over 500%, with AI assisted triage cited as a key modality [25]. In low resource areas lacking specialists, AI driven mobile ultrasound allows trained non-experts to perform scans. The Clarius OB AI is FDA cleared for handheld fetal biometry, explicitly aiming at rural/midwife settings [3]. By highlighting anatomy and auto-placing calipers, it reduces required expertise [3]. Similarly, AI-enabled tele-ultrasound (expert review of AI annotated images) has improved access in remote clinics.

Other telehealth uses include AI chatbots for patient education and symptom triage. LLMs could power digital assistants for pregnant women, answering questions about medications or warning signs. For example, a chatbot was shown to handle OBGYN queries accurately (with oversight) and could help with medication reconciliation or lifestyle advice. However, tele-AI must be used judiciously: barriers include electricity supply, internet access gaps (43% of rural Sub-Saharan clinics lack stable broadband), and digital literacy [26]. The telemedicine literature also warns of equity issues: interventions must be designed to avoid widening disparities [27].

Data, Validation, and Performance Metrics

Datasets are the foundation of obstetric AI. However, public obstetric data sets are scarce and often small. Common sources include large clinical registries (e.g., NICHD Maternal Fetal Medicine Network data), public CTG archives (Oxford CTG database), and proprietary hospital archives. For ultrasound, commercial firms train on hundreds of thousands to millions of images: one delivery date AI was trained on ~1.17 million images from 5700 women [5]. For CTG and labor, data sets are usually a few hundred records (e.g., Chiou’s 552 CTGs [19]) due to data collection challenges. Genomic models use sequencing databases of variable size.

Training on multi-center data is crucial for generalizability, and external validation is essential. Several studies have shown that models often perform worse on external cohorts due to population or equipment differences. For example, CTG models exhibit a performance drop when applied to a different hospital [28]. To mitigate this, federated learning, in which models train locally and share only parameters, is being explored. In reproductive medicine, a federated model outperformed hospital-specific models and preserved patient privacy [29]. Similar federated approaches could allow hospitals to pool what is learned from obstetric data without exchanging raw data, potentially improving AI performance and equity.
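The parameter-sharing idea can be sketched with federated averaging (FedAvg), a widely used aggregation scheme in which a coordinator combines locally trained parameters weighted by each site's sample size. This is an illustrative assumption about how such a system might aggregate, not the method of the cited study; the hospital names, parameter vectors, and cohort sizes are hypothetical.

```python
def federated_average(site_parameters, site_sizes):
    """Combine per-site parameter vectors into one, weighted by site sample size."""
    if not site_parameters or len(site_parameters) != len(site_sizes):
        raise ValueError("need one sample size per site and at least one site")
    total = sum(site_sizes)
    n_params = len(site_parameters[0])
    global_parameters = [0.0] * n_params
    for parameters, size in zip(site_parameters, site_sizes):
        if len(parameters) != n_params:
            raise ValueError("all sites must share the same model shape")
        for i, value in enumerate(parameters):
            global_parameters[i] += value * size / total
    return global_parameters


# Hypothetical locally trained parameters (e.g., two model coefficients per site).
hospital_a = [0.20, -1.00]
hospital_b = [0.40, -0.50]
hospital_c = [0.10, -0.80]
cohort_sizes = [100, 300, 600]

global_model = federated_average([hospital_a, hospital_b, hospital_c], cohort_sizes)
print(global_model)
```

Only the parameter vectors and cohort sizes leave each hospital in this scheme; patient-level records stay local. A full system would repeat this aggregation over many training rounds.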

Validation of AI tools must be rigorous. Key metrics include area under the ROC curve (AUC), accuracy, sensitivity/specificity, F1 score, positive predictive value, and regression R2 (for continuous outcomes like GA). For risk scores, calibration (observed vs. predicted risk) is also assessed. Every study should use separate test sets and, preferably, external validation or cross validation to guard against overfitting. In practice, many models report cross validated AUCs (e.g., 0.92 for XGBoost in pre-eclampsia [24]) but lack independent prospective testing.
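For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, positive predictive value, accuracy, F1 score, and R2 (the coefficient of determination) are computed from held-out predictions. All labels and values are toy data chosen for illustration; the function names are hypothetical.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(y_true, y_pred):
    """Compute common binary-classification metrics from 0/1 labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
        "ppv": safe_ratio(tp, tp + fp),
        "accuracy": safe_ratio(tp + tn, len(pairs)),
        "f1": safe_ratio(2 * tp, 2 * tp + fp + fn),
    }


def r_squared(observed, predicted):
    """Coefficient of determination for a continuous outcome such as GA in days."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1 - ss_res / ss_tot


# Toy held-out test set: 1 = condition present, 0 = absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
fit = r_squared([10, 20, 30], [12, 18, 30])
print(metrics, fit)
```

These figures are only meaningful when computed on data the model did not see during training, which is why the separation of test sets matters.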

Comparisons with clinician performance or established tools are instructive. For instance, AI fetal biometrics were compared to sonographer measurements [1]; AI CTG models are often benchmarked against expert review. In many cases, AI significantly outperforms baseline logistic models. For example, ML for PPH achieved AUC~0.93 vs. logistic 0.87 [12], and delivery mode AI reached ~0.93 vs. ~0.75 for basic risk scores [20].

Bias, Explainability, and Ethics

AI systems must be scrutinized for bias. Training data often underrepresents certain groups (e.g., minority ethnicities, extreme BMI), risking skewed performance. For instance, if an AI CTG model is trained on data mostly from one population, it may not generalize to others. Guidelines explicitly call for “mitigating algorithmic bias” and transparent communication with patients [14]. Explainability tools (SHAP values, saliency maps) are increasingly used. For example, SHAP analysis in XGBoost pre-eclampsia models identified top predictors (blood pressure, glucose) and can reveal when a model over-relies on confounders [30].

However, explainability has limits: complex DL models remain opaque. Regulatory bodies and ethicists stress human oversight. A recent perspective warns that AI risk labels (e.g., “high-risk pregnancy”) can cause anxiety or stigma if not communicated properly [31,32]. There is a risk of reinforcing disparities: for example, “over-flagging socially disadvantaged groups” can occur if historical biases in care or data are encoded in AI [31]. Therefore, ethics frameworks for obstetric AI emphasize fairness audits, subgroup calibration, and combining risk scores with patient counseling [31,32].
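A fairness audit of the kind described above can start very simply: compare, for each subgroup, the observed event rate with the mean predicted risk (a coarse calibration check) and the sensitivity of the model's alerts. The sketch below is a minimal illustration under that assumption; the record layout, group labels, and values are hypothetical.

```python
from collections import defaultdict


def subgroup_audit(records):
    """Summarize calibration and sensitivity per subgroup.

    Each record is (group, outcome, predicted_risk, flagged), where outcome is
    0/1, predicted_risk is a probability, and flagged is the model's binary alert.
    """
    totals = defaultdict(lambda: {"n": 0, "events": 0, "risk_sum": 0.0, "true_pos": 0})
    for group, outcome, predicted_risk, flagged in records:
        entry = totals[group]
        entry["n"] += 1
        entry["events"] += outcome
        entry["risk_sum"] += predicted_risk
        if outcome == 1 and flagged:
            entry["true_pos"] += 1

    report = {}
    for group, entry in totals.items():
        report[group] = {
            "observed_rate": entry["events"] / entry["n"],
            "mean_predicted_risk": entry["risk_sum"] / entry["n"],
            # Sensitivity is undefined when a subgroup has no events.
            "sensitivity": entry["true_pos"] / entry["events"] if entry["events"] else None,
        }
    return report


# Toy records for two hypothetical subgroups, A and B.
toy_records = [
    ("A", 1, 0.8, True), ("A", 0, 0.2, False), ("A", 1, 0.6, True), ("A", 0, 0.4, False),
    ("B", 1, 0.3, False), ("B", 1, 0.7, True), ("B", 0, 0.2, False), ("B", 0, 0.2, False),
]
report = subgroup_audit(toy_records)
for group, summary in sorted(report.items()):
    print(group, summary)
```

In this toy data both groups have the same observed event rate, but the model under-predicts risk and misses more events in group B, which is the kind of gap an audit is meant to surface.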

Data privacy is another major concern. Prenatal data are sensitive (genomics, images, EHR). Regulations like HIPAA (US) and GDPR (EU) apply. AI development should incorporate privacy safeguards: de-identification, secure data storage, and when possible, privacy-preserving techniques (federated learning [29], differential privacy). A scoping review noted that fragmented policies and unclear data governance are barriers to AI uptake in OB/GYN telehealth [33]. Ensuring patient consent and transparency about AI use are ethical imperatives.
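As one illustration of a privacy-preserving technique, the sketch below applies the Laplace mechanism from differential privacy to a simple count query (for example, the number of patients with a given diagnosis in a registry). It assumes a count has sensitivity 1, so the noise scale is 1/epsilon; the function name, the count, and the epsilon value are hypothetical, and a production system would need careful privacy-budget accounting beyond this sketch.

```python
import random


def private_count(true_count, epsilon, rng):
    """Release a count with Laplace noise calibrated to sensitivity 1 and budget epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # The difference of two independent exponential draws with rate epsilon
    # follows a Laplace distribution with scale 1 / epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


# Hypothetical query: 42 matching patients, released with epsilon = 1.0.
rng = random.Random(2026)
noisy_release = private_count(42, epsilon=1.0, rng=rng)
print(f"Noisy count: {noisy_release:.1f}")
```

Smaller epsilon values add more noise and give stronger privacy at the cost of accuracy, so the choice of epsilon is a policy decision as much as a technical one.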

Finally, patient autonomy and clinician accountability must be maintained. AI should support, not replace, obstetric judgment. Liability for AI-driven decisions is a legal grey area; current guidance suggests AI be treated as decision support rather than an autonomous actor. Providers must verify AI recommendations, and institutions need policies on oversight. Ongoing ethics research calls for interdisciplinary input (clinicians, data scientists, ethicists, patient advocates) in AI tool design and deployment.

Clinical Trials and Regulatory Approval

Most AI in obstetrics remains at the research stage; few tools have formal regulatory clearance. Notable exceptions are those integrated into medical devices:

FDA-cleared tools: The FDA has begun approving AI devices for obstetric use. In 2024 Clarius received clearance for its OB AI (handheld ultrasound for fetal measurements) [3]. In 2025, an AI system for fetal echocardiography (Cinc’s DxOne Fetal™) became FDA-authorized, enhancing detection of heart defects [2]. In early 2026, the “Delivery Date AI” system received De Novo FDA approval for predicting delivery timing from ultrasound [10]. These clearances require demonstration of safety and effectiveness (usually in clinical studies), but often under a “software as a medical device” pathway.

European approval: Many AI ultrasound aids are CE marked (via MDR or IVDR compliance), for example GE’s AI based fetal echocardiography assist tool or Samsung’s auto-biometry. However, CE marking does not guarantee all performance claims; clinicians must rely on published evaluations.

Clinical trials: Few prospective trials of AI in OB have been completed. One example (ongoing) is the ‘PERFORM’ study comparing LLMs to residents (abstract only, no results yet) [34]. Most evidence is retrospective or cross-sectional. There is a pressing need for randomized or prospective implementation studies to assess impact on outcomes (e.g., does AI screening reduce missed diagnoses or unnecessary interventions?). Any AI incorporated into clinical care should ideally undergo phased trials, starting with pilot feasibility and moving to larger effectiveness trials.

Guidelines and standards: Professional bodies have not yet issued OB specific AI guidelines (as of 2026), but general digital health frameworks apply. Research adherence to TRIPOD (for prediction models) and CONSORT-AI/DECIDE-AI (for clinical trials) is increasingly expected. Reporting standards (e.g., clearly stating data sources, validation methods) are critical to ensure reproducibility.

Integration into Clinical Workflows

Integrating AI tools into everyday obstetric care requires workflow compatibility and user training. For example, AI-assisted ultrasound often presents as an optional overlay on existing ultrasound machines or smartphone apps. Studies report that providers can adapt to AI guidance with brief training, which can also serve as a teaching aid [3]. Importantly, AI suggestions (e.g., boundaries for fetal structures) should be easily toggled so clinicians retain control.

In fetal monitoring, AI outputs should integrate with electronic fetal monitoring (EFM) consoles or EHR alerts. User interface design (clear risk scores, trend graphs) influences acceptance. Studies have found clinicians prefer explainable outputs and the ability to review underlying signals.

Implementation also hinges on IT infrastructure (data connectivity, compute power) and interoperability (AI outputs feeding into hospital systems). Pilot implementations should involve end users to refine user experience and minimize “alert fatigue”. Moreover, cost benefit analyses and reimbursement models need consideration: e.g., will insurers cover AI based scans or triage? And to what extent? Early evidence suggests improved diagnostic yield (e.g., higher CHD detection) could justify costs.

Training and change management are key. Surveys show some obstetricians are skeptical of “black-box” AI [6]. Education on AI capabilities and limitations for clinicians and patients is important. ACOG and RCOG have called for continuing education on emerging technologies. Ultimately, successful integration will require demonstrating that AI tools improve patient outcomes or efficiency without disrupting care.

Future Research Directions

To fully realize AI’s potential in obstetrics, future work should address current limitations and explore new frontiers:

Data Sharing and Standards: The field needs large, diverse, high-quality datasets. Collaboration between national and international institutions (potentially via federated learning [29]) can aggregate data while preserving privacy. Standardization of data formats (e.g., DICOM for imaging, unified partographs) will help. Initiatives to create publicly available obstetric AI benchmarks (analogous to ImageNet in vision) could accelerate progress and enhance adoption.

Prospective Validation: Robust prospective and multi-center trials are needed. These should test AI tools in real clinical settings, measure health outcomes (e.g., detection rates, perinatal mortality) and monitor unintended effects. Regulators and the public are likely to demand such evidence.

Explainable and Multimodal AI: Advances in interpretable AI (e.g., prototype learning, attention maps) may build trust. Likewise, multimodal models that combine imaging, signals, EHR data and even genomics could improve predictions (e.g., integrating ultrasound and lab values for growth restriction). Research into LLMs for clinical support (beyond exam performance [13]) could enable smarter EMR query and patient communication tools.

Addressing Bias and Equity: Future models must be audited for fairness across race, socioeconomic and geographic groups. Techniques like subgroup calibration and bias mitigation need validation. Research on AI’s impact on health equity (does it reduce or exacerbate disparities?) is essential.

Regulatory and Ethical Frameworks: As AI becomes mainstream, clear guidelines are needed. Research into legal liability, consent models, and ethical AI use in pregnancy (e.g., limits on prenatal risk screening) should inform policy. Longitudinal studies on patient attitudes and psychosocial effects of AI (as warned by Ji et al. [31]) will be important.

Emerging Tech: The convergence of AI with other technologies (wearables, AR/VR, robotics) will shape the future. For example, AI guided robotics for precise ultrasound probe placement or automated fetal resuscitation support is conceivable. Immersive simulations (digital twins of mother fetus systems) are being explored for training.

Cost-effectiveness: Ultimately, health economic analyses should compare AI driven care pathways to standard care to justify adoption.

In summary, the medical community is conservative and sometimes reluctant to adopt new modalities. Nevertheless, AI is being implemented in various fields. Ongoing interdisciplinary collaboration will be vital: obstetricians, midwives, data scientists, ethicists and patients must work together to ensure AI innovations are safe, effective, and aligned with women’s needs.

References

Venturini, L.; Budd, S.; Farruggia, A.; et al. Whole-examination AI estimation of fetal biometrics from 20-week ultrasound scans. npj Digit Med. 2025, 8, 22.
Arnaout, R.; Curran, L.; Zhao, Y.; Levine, J.C.; Chinn, E.; Moon-Grady, A.J. An ensemble of neural networks provides expert-level prenatal detection of complex congenital heart disease. Nat. Med. 2021, 27, 882–891.
Clarius Mobile Health. FDA Clears Clarius OB AI for Handheld Ultrasound: A Leap Forward in Prenatal Care. 2024. Available online: https://clarius.com/press/fda-clears-clarius-ob-ai-for-handheld-ultrasound-leap-forward-in-prenatal-care/.
Ranjbar, A.; Mehrnoush, V.; Darsareh, F.; et al. Machine learning models for predicting preeclampsia: A systematic review. BMC Pregnancy Childbirth 2024, 24, 6.
Patel, N.; O’Brien, J.; Bunn, R.; Schanbacher, B.; Bauer, J.; Lam, G.K. Perinatal artificial intelligence in ultrasound (PAIR) study: Predicting delivery timing. J. Matern Fetal Neonatal Med. 2025, 38, 2532099.
Topol, E.J. High-performance medicine: The convergence of human and artificial intelligence. Nat. Med. 2019, 25, 44–56.
Rajkomar, A.; Dean, J.; Kohane, I. Machine learning in medicine. N. Engl. J. Med. 2019, 380, 1347–1358.
Drukker, L.; Noble, J.A.; Papageorghiou, A.T. Introduction to artificial intelligence in ultrasound imaging in obstetrics and gynecology. Ultrasound Obstet. Gynecol. 2020, 56, 498–505.
Xie, H.N.; Wang, N.; He, M.; et al. Using deep-learning algorithms to classify fetal brain ultrasound images as normal or abnormal. Ultrasound Obstet. Gynecol. 2020, 56, 637–646.
Ultrasound AI Receives FDA de Novo Clearance for Delivery Date AI Technology. Ultrasound AI. Press Release. 2026. Available online: https://www.prnewswire.com/news-releases/ultrasound-ai-receives-fda-de-novo-clearance-for-delivery-date-ai-technology (accessed on 2 March 2026).
Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71.
Venkatesh, K.K.; Strauss, R.A.; Grotegut, C.A.; et al. Machine learning and statistical models to predict postpartum hemorrhage. Obstet. Gynecol. 2020, 135, 935–944.
Li, S.W.; Kemp, M.W.; Logan, S.J.S.; et al. ChatGPT outperforming human candidates in obstetrics and gynaecology postgraduate examinations. Am. J. Obstet. Gynecol. 2023, 229, 172–173.
World Health Organization. Ethics and Governance of Artificial Intelligence for Health: WHO Guidance; World Health Organization: Geneva, Switzerland, 2021.
Pokaprakarn, T.; Prieto, J.C.; Price, J.T.; et al. AI estimation of gestational age from blind ultrasound sweeps in low-resource settings. NEJM Evid. 2022, 1, EVIDoa2100058.
Palacio, M.; Bonet-Carne, E.; Cobo, T.; et al. Prediction of neonatal respiratory morbidity by quantitative ultrasound lung texture analysis: A multicentre study. Am. J. Obstet. Gynecol. 2017, 217, 196.e1–196.e14.
Boddupally, K.; Thuraka, E.R. Artificial intelligence for prenatal chromosome analysis. Clin. Chim. Acta 2024, 552, 117669.
Aeberhard, J.L.; Radan, A.P.; Soltani, R.A.; et al. Artificial intelligence and machine learning in cardiotocography: A scoping review. Eur. J. Obstet. Gynecol. Reprod. Biol. 2024, 281, 54–62.
Suiwen, L.; Xiaodan, D.; Minrong, Y.; et al. Artificial intelligence-based prediction of fetal hypoxia: A multicenter model development and nationwide AI-human comparison. BMC Med. 2026, 24, 279.
Abdelgadir, A.; Elhabeeb, E.; et al. Enhancing obstetric decision-making with AI: A systematic review of AI models for predicting mode of delivery. PLoS ONE 2024, 19, e0309129.
Lee, Y.; Kim, S.Y.; Park, H. Clinical utility assessment framework for machine learning-based fetal health classification in cardiotocography: An observational study. Obstet. Gynecol. Sci. 2026, 69, 119–127.
Guedalia, J.; Lipschuetz, M.; Calderon-Margalit, R.; et al. Real-time data analysis using a machine learning model significantly improves prediction of successful vaginal deliveries. Am. J. Obstet. Gynecol. 2020, 223, 437.e1–437.e15.
Lipschuetz, M.; Guedalia, J.; Rottenstreich, A.; et al. Prediction of vaginal birth after cesarean deliveries using machine learning. Am. J. Obstet. Gynecol. 2020, 222, 613.e1–613.e12.
Shyu, I.L.; Liu, C.F.; Tsai, Y.C.; et al. Machine learning predictive system to predict the risk of developing pre-eclampsia. BMJ Health Care Inform. 2025, 32, e101151.
Aziz, A.; Zork, N.; Aubey, J.J.; et al. Telehealth for high-risk pregnancies in the setting of the COVID-19 pandemic. Am. J. Perinatol. 2020, 37, 800–808.
Fontalvo, H.R.; Florentino Rico, F.; Mario de la Puente, M.; et al. When telehealth fails rural communities: The 40% internet threshold that changes everything. Digit Health 2025, 11, 20552076251393407.
Nouri, S.; Khoong, E.C.; Lyles, C.R.; Karliner, L. Addressing equity in telemedicine for chronic disease management during the COVID-19 pandemic. NEJM Catal. Innov. Care Deliv. 2020. [Google Scholar] [CrossRef]
Futoma, J.; Simons, M.; Panch, T.; Doshi-Velez, F.; Celi, L.A. The myth of generalizability in clinical research and machine learning in health care. Lancet Digit Health 2020, 2, e489–e492. [Google Scholar] [CrossRef] [PubMed]
Rieke, N.; Hancox, J.; Li, W.; et al. The future of digital health with federated learning. npj Digit Med. 2020, 3, 119. [Google Scholar] [CrossRef] [PubMed]
Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process Syst. (NeurIPS) 2017, 30, 4765–4774. [Google Scholar]
Ji, Q.; Wang, M. AI-driven high-risk pregnancy prediction: Balancing early detection, anxiety, and discrimination in digital public health. Front Public Health 2026, 14, 1752484. [Google Scholar] [CrossRef] [PubMed]
Chen, I.Y.; Pierson, E.; Rose, S.; Joshi, S.; Ferryman, K.; Ghassemi, M. Ethical machine learning in healthcare. Annu Rev. Biomed. Data Sci. 2021, 4, 123–144. [Google Scholar] [CrossRef] [PubMed]
Medani, I.E.; Hakami, A.M.; Chourasia, U.H.; et al. Telemedicine in Obstetrics and Gynecology: A Scoping Review of Enhancing Access and Outcomes in Modern Healthcare. Healthcare 2025, 13, 2036. [Google Scholar] [CrossRef] [PubMed]
Martinelli, C.; Giordano, A.; Carnevale, V.; et al. The PERFORM Study: Artificial Intelligence Versus Human Residents in Cross-Sectional Obstetrics-Gynecology Scenarios Across Languages and Time Constraints. Mayo Clin. Proc. Digit Health 2025, 3, 100206. [Google Scholar] [CrossRef] [PubMed]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.