Predictive Models and Artificial Intelligence for Risk Stratification in Emergency Abdominal Surgery: A Contemporary Review

Catalin Dumitru Cosma; Vlad Olimpiu Butiurca; Marian Botoncea; Dragoș Molnar; Calin Molnar

doi:10.20944/preprints202606.1200.v1

Submitted: 15 June 2026

Posted: 16 June 2026


Abstract

Background/Objectives: Emergency abdominal surgery is associated with substantial morbidity and mortality due to disease severity, physiological instability, and limited opportunities for preoperative optimization. Accurate risk stratification is therefore essential for guiding clinical decision-making and resource allocation. This review aims to provide a contemporary overview of predictive models and artificial intelligence (AI) applications for risk stratification in emergency abdominal surgery. Methods: A narrative review of the literature was conducted using PubMed/MEDLINE, Scopus, and Web of Science databases. Studies addressing conventional risk assessment tools, predictive modeling, machine learning, deep learning, radiomics, explainable AI, and clinical implementation in emergency abdominal surgery were evaluated. Relevant publications concerning acute appendicitis, acute cholecystitis, intestinal obstruction, emergency laparotomy, abdominal sepsis, and acute mesenteric ischemia were included. Results: Conventional risk assessment systems, including ASA, APACHE II, SOFA, POSSUM, SORT, and NELA, remain widely used but are limited by static risk calculations and restricted adaptability. Recent advances in AI have enabled the development of machine learning and deep learning models capable of integrating complex clinical, laboratory, and imaging data to improve the prediction of disease severity, postoperative complications, and mortality. Disease-specific applications have demonstrated promising results in acute appendicitis, acute cholecystitis, intestinal obstruction, abdominal sepsis, and emergency laparotomy. Emerging technologies such as radiomics, computer vision, and explainable AI further enhance predictive performance and model interpretability. However, challenges related to external validation, algorithmic bias, transparency, regulatory compliance, and clinical integration remain significant barriers to widespread implementation. 
Conclusions: Artificial intelligence has the potential to significantly enhance risk stratification in emergency abdominal surgery by enabling more precise and individualized prediction of adverse outcomes. Although current evidence is encouraging, robust prospective validation and responsible clinical implementation are required before AI-driven predictive models can be routinely incorporated into emergency surgical practice.

Keywords: emergency abdominal surgery; artificial intelligence; machine learning; predictive modeling; risk stratification; emergency laparotomy; explainable artificial intelligence; radiomics; clinical decision support; precision surgery

Subject: Medicine and Pharmacology - Surgery

1. Introduction

1.1. Burden of Emergency Abdominal Surgery

Emergency abdominal surgery (EAS) is one of the most challenging areas of modern surgical practice. It encompasses a heterogeneous spectrum of conditions, including acute appendicitis, acute cholecystitis, intestinal obstruction, secondary peritonitis, and acute mesenteric ischemia. These conditions often require emergent operative intervention and carry significantly higher morbidity and mortality than elective surgical procedures. Although perioperative care has improved, emergency surgical patients still have worse outcomes than the average surgical patient because of delayed presentation, more severe disease, physiological derangement, and multiple comorbidities [1,2,3,4,5,6,7,8,9,10].

Emergency abdominal surgery is a major global healthcare burden, accounting for a substantial share of hospital admissions, intensive care unit utilization, and postoperative mortality. Large international studies have shown wide variation in outcomes between healthcare systems. Mortality rates after emergency laparotomy range from less than 5% in selected low-risk populations to more than 20% in elderly or physiologically compromised patients [1,7,9]. The growing proportion of elderly patients and the rising prevalence of frailty, heart disease, diabetes mellitus, and immunosuppression further complicate perioperative management and contribute to adverse postoperative outcomes [2,5,8].

In elective surgery, patients can usually be optimized preoperatively, whereas emergency abdominal surgery often requires rapid clinical decisions under uncertainty. Surgeons must weigh the risks of operative intervention against the consequences of delayed treatment, integrating multiple clinical, laboratory, and radiological variables within a limited time. Accurate risk stratification has therefore become an important part of modern emergency surgical practice [3,4,6].

1.2. Why Risk Stratification Matters

The goal of risk stratification is to identify patients at increased risk of postoperative complications, prolonged hospital stay, intensive care admission, and mortality. Reliable prediction of adverse outcomes facilitates tailored treatment planning, optimization of perioperative resources, informed consent, and early initiation of targeted interventions. In addition, objective risk assessment supports clinical decision-making by reducing variation between practitioners and allowing evidence-based prioritization of care [11,12,13,14,15,16,17,18,19,20]. Several traditional scoring systems have been designed to estimate perioperative risk in emergency surgical patients. Commonly used tools include the American Society of Anesthesiologists (ASA) Physical Status Classification, the Acute Physiology and Chronic Health Evaluation II (APACHE II), the Sequential Organ Failure Assessment (SOFA), the Physiological and Operative Severity Score for the Enumeration of Mortality and Morbidity (POSSUM), and the National Emergency Laparotomy Audit (NELA) risk model [11,12,13,14,15,16,17,18,19,20]. Although these systems provide valuable prognostic information, their predictive ability is often limited by reliance on predefined variables, static risk assessment, and poor adaptability to different patient populations.

1.3. Emergence of Artificial Intelligence and Predictive Analytics

The emergence of artificial intelligence (AI), machine learning (ML), and predictive analytics has opened new avenues for improving risk assessment in emergency surgery. Unlike traditional statistical approaches, AI-based modeling techniques can handle large and complex datasets, identify nonlinear relationships between variables, and iteratively learn from data to improve prediction accuracy [21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40].

The application of AI in surgery has grown rapidly over the last decade and now includes perioperative risk prediction, prediction of postoperative complications, clinical decision support, computer vision, image analysis, and automated interpretation of electronic health records. Several studies have shown that machine learning algorithms might outperform traditional scoring systems for the prediction of mortality, complications, intensive care needs, and length of hospital stay [33,34,35,36,37,38,39,40]. Furthermore, new explainable AI methods have improved transparency and interpretability, addressing one of the main barriers to clinical adoption [81,82,83,84,85,86,87,88,89].

AI-based predictive models have been studied for different conditions in emergency abdominal surgery, including acute appendicitis, acute cholecystitis, intestinal obstruction, abdominal sepsis, and acute mesenteric ischemia. These approaches have shown promise in identifying patients at a high risk of disease progression, predicting operative complexity, and supporting individualized treatment strategies [41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80].

1.4. Aim of the Review

This contemporary review aims to provide a comprehensive overview of predictive models and artificial intelligence applications for risk stratification in emergency abdominal surgery. Particular emphasis is placed on conventional risk assessment tools, machine learning methodologies, disease-specific predictive models, explainable AI, clinical implementation challenges, and future directions toward precision emergency surgery. By synthesizing current evidence, this review seeks to highlight both the opportunities and limitations of AI-driven risk prediction and to identify priorities for future research and clinical translation.

2. Materials and Methods

2.1. Study Design

This study was conducted as a narrative contemporary review of the literature aimed at evaluating the current role of predictive models and artificial intelligence (AI) in risk stratification for emergency abdominal surgery. The review focused on both conventional risk assessment systems and emerging AI-based approaches, including machine learning, deep learning, radiomics, and explainable AI applications across major emergency abdominal surgical pathologies.

2.2. Literature Search Strategy

A comprehensive literature search was performed using the PubMed/MEDLINE, Scopus, and Web of Science databases. Additional relevant publications were identified through manual screening of reference lists from selected articles and key review papers. The search strategy combined Medical Subject Headings (MeSH) and free-text terms related to emergency surgery, predictive modeling, machine learning, and artificial intelligence. Representative search terms included: “emergency abdominal surgery”, “emergency laparotomy”, “risk stratification”, “risk prediction”, “predictive model”, “machine learning”, “artificial intelligence”, “deep learning”, “clinical decision support”, “appendicitis”, “acute cholecystitis”, “intestinal obstruction”, “peritonitis”, “abdominal sepsis”, and “mesenteric ischemia”.

Search terms were combined using Boolean operators (“AND”, “OR”) to maximize sensitivity and identify studies relevant to both traditional and AI-driven risk assessment.
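As an illustration of how such a Boolean strategy can be assembled, the sketch below joins synonyms with OR inside each concept group and links the groups with AND. The split into a "setting" group and a "method" group, and the helper name `build_query`, are assumptions made for this example; the exact search strings used for the review are not reproduced here.

```python
def build_query(concept_groups):
    """Join terms with OR inside each group, then join the groups with AND."""
    clauses = []
    for terms in concept_groups:
        quoted = " OR ".join(f'"{term}"' for term in terms)
        clauses.append(f"({quoted})")
    return " AND ".join(clauses)


# Hypothetical grouping of a few of the representative search terms.
setting = ["emergency abdominal surgery", "emergency laparotomy"]
method = ["risk stratification", "machine learning", "artificial intelligence"]

query = build_query([setting, method])
print(query)
```

Using OR within a concept broadens sensitivity, while AND between concepts restricts the results to records that address both the clinical setting and the prediction method.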

2.3. Eligibility Criteria

Articles were considered eligible if they met one or more of the following criteria:

Investigated risk prediction or risk stratification in emergency abdominal surgery.
Evaluated conventional surgical risk assessment tools, including ASA, APACHE II, SOFA, qSOFA, POSSUM, P-POSSUM, SORT, or NELA.
Reported the development, validation, or implementation of predictive models using statistical methods, machine learning, or artificial intelligence.
Examined AI applications in acute appendicitis, acute cholecystitis, intestinal obstruction, emergency laparotomy, secondary peritonitis, abdominal sepsis, or acute mesenteric ischemia.
Addressed explainable AI, radiomics, clinical implementation, ethical considerations, or regulatory aspects of AI in healthcare.

The review included original research articles, prospective and retrospective cohort studies, systematic reviews, meta-analyses, consensus statements, clinical guidelines, and methodological papers. Publications not available in English, conference abstracts lacking sufficient methodological details, editorials, commentaries, and studies unrelated to emergency abdominal surgery were excluded.

2.4. Study Selection and Data Synthesis

The retrieved literature was screened for relevance based on title, abstract, and full-text evaluation. Particular emphasis was placed on highly cited landmark publications, international guidelines, externally validated predictive models, and recent studies published between 2020 and 2026 that reflected contemporary developments in surgical artificial intelligence. The selected evidence was synthesized narratively and organized into five thematic domains:

Conventional risk stratification systems in emergency surgery.
Predictive modeling and artificial intelligence methodologies.
Disease-specific applications of AI in emergency abdominal surgery.
Explainable AI, radiomics, and clinical implementation.
Ethical, regulatory, and future perspectives.

Given the substantial heterogeneity in study design, patient populations, predictive variables, outcomes, and machine learning methodologies, a quantitative meta-analysis was not performed. Instead, findings were summarized descriptively to provide a clinically oriented overview of the current evidence and future directions for AI-assisted risk prediction in emergency abdominal surgery.

3. Results

3.1. Conventional Risk Stratification Models in Emergency Abdominal Surgery

Accurate risk stratification remains a core element of emergency abdominal surgery and offers physicians an objective way to estimate perioperative morbidity, mortality, the need for intensive care, and postoperative complications. Patients presenting with acute abdominal conditions are often highly heterogeneous in terms of age, comorbidities, physiological reserve, disease severity, and operative urgency. Reliable prediction models are therefore needed to support surgical decision-making, optimize perioperative management, support informed consent, and allocate healthcare resources effectively [11,12,13,14,15,16,17,18,19,20].

During the last few decades, many systems for the assessment of surgical risk have been developed. One of the most common is the American Society of Anesthesiologists (ASA) Physical Status Classification, which classifies patients based on their general preoperative health status. Although simple and universally applicable, the predictive performance of the ASA classification is limited by interobserver variability and the absence of procedure-specific variables [11].

The Acute Physiology and Chronic Health Evaluation II (APACHE II) score was developed for critically ill patients and includes physiological measurements, laboratory parameters, age, and chronic health conditions. It has been extensively studied and has been shown to predict mortality in critically ill surgical patients. However, its complexity and the need for extensive physiologic data may limit its routine use in emergency surgical settings [12].

Likewise, the Sequential Organ Failure Assessment (SOFA) score and its simplified form, the quick SOFA (qSOFA), have emerged as valuable tools for evaluating organ dysfunction and identifying patients at risk for sepsis-related adverse outcomes. These systems are particularly relevant in emergency abdominal conditions complicated by peritonitis, abdominal sepsis, or septic shock, where rapid identification of physiological deterioration is essential [13,14].

Procedure-specific models have also been developed to improve surgical risk prediction. The Physiological and Operative Severity Score for the Enumeration of Mortality and Morbidity (POSSUM) and its Portsmouth modification (P-POSSUM) combine physiological and operative variables to predict postoperative morbidity and mortality. These models have been extensively validated in various surgical populations and remain among the most widely used tools in emergency surgery research [15,16].

More recently, contemporary risk assessment systems such as the Surgical Outcome Risk Tool (SORT) and the National Emergency Laparotomy Audit (NELA) risk model have shown improved predictive performance in patients undergoing emergency laparotomy. The NELA model, developed using a large national cohort of emergency laparotomy patients, incorporates demographic, physiological, biochemical, and operative variables to estimate 30-day mortality risk [17,18,19]. Systematic reviews have confirmed the usefulness of these models for perioperative risk assessment, but they have also highlighted variability in predictive performance across healthcare systems and patient populations [20].

Traditional risk assessment tools, although widely used, have several important limitations. Most models are built on a set of predefined variables and linear statistical relationships and may therefore overlook complex interactions between clinical, laboratory, radiological, and operative factors. Moreover, many scores were developed from historical data and may not sufficiently reflect changes in patient demographics, perioperative care pathways, and surgical practice [17,18,19,20]. Their static nature also limits the ability to continuously incorporate evolving clinical information throughout the patient's hospital course.

These limitations have stimulated growing interest in predictive analytics and artificial intelligence-based approaches capable of integrating large-scale multidimensional datasets and identifying nonlinear relationships that may improve risk prediction. Consequently, machine learning and other AI methodologies are increasingly being investigated as potential alternatives or complements to conventional risk stratification systems in emergency abdominal surgery. [Table I]
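To make the simplicity of qSOFA concrete, the following minimal sketch computes the score from three bedside variables. The thresholds (respiratory rate of at least 22 breaths/min, systolic blood pressure of 100 mmHg or less, and altered mentation taken as a Glasgow Coma Scale below 15) follow the Sepsis-3 definition rather than anything stated in this review, and the function name `qsofa` is hypothetical.

```python
def qsofa(respiratory_rate, systolic_bp, gcs):
    """Return the qSOFA score (0-3) from three bedside observations.

    Thresholds are taken from the Sepsis-3 definition (an assumption of
    this sketch): one point each for respiratory rate >= 22 breaths/min,
    systolic blood pressure <= 100 mmHg, and Glasgow Coma Scale < 15.
    """
    score = 0
    if respiratory_rate >= 22:
        score += 1
    if systolic_bp <= 100:
        score += 1
    if gcs < 15:
        score += 1
    return score


# Hypothetical patient with peritonitis: tachypnoeic and hypotensive, alert.
print(qsofa(respiratory_rate=24, systolic_bp=95, gcs=15))
```

A score of 2 or more is conventionally read as a prompt to look for organ dysfunction. The calculation uses fixed cut-offs and ignores how far a value lies beyond each threshold, which illustrates the static character of conventional scores.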

3.2. Predictive Modeling and Artificial Intelligence in Surgery

The increasing availability of electronic health records, advanced medical imaging, and large-scale clinical datasets has led to an increased use of predictive modeling and artificial intelligence (AI) across multiple surgical specialties. In contrast to traditional risk assessment systems based on predetermined variables and linear statistical relationships, AI-based systems can analyze complex multidimensional datasets and detect nonlinear interactions that may have a substantial impact on clinical outcomes [21,22,23,24,25,26,27,28,29,30,31,32].

Predictive modeling encompasses a wide spectrum of statistical and computational methods aimed at estimating the probability of future clinical events. Traditional predictive models rely on logistic regression and multivariable statistical analysis, while current AI-based models use machine learning (ML) and deep learning (DL) algorithms to enhance predictive accuracy and flexibility. These methods can combine demographic features, physiological parameters, laboratory results, imaging information, operative variables, and longitudinal clinical information into a single predictive framework [21,22,23,24,25,26,27,28,29].

Machine learning is the most prevalent branch of AI applied to surgical outcome prediction. Supervised learning algorithms, such as logistic regression, decision trees, random forests, support vector machines, gradient boosting models, and extreme gradient boosting (XGBoost), are trained on labeled datasets to find patterns associated with predefined outcomes (e.g., postoperative complications, mortality, intensive care admission, or prolonged hospitalization). By learning from large datasets, these models can detect complex interactions among variables that may not be apparent using traditional statistical approaches [25,26,27,28,29].

Deep learning is a more advanced subset of machine learning that uses multilayer artificial neural networks to process large amounts of structured and unstructured data. Deep learning models have shown particular usefulness in medical image analysis, computer vision, radiomics, and automated interpretation of surgical videos. Recent advances have enabled automated recognition of anatomical structures, intraoperative phase identification, prediction of operative complexity, and assessment of surgical performance, highlighting the expanding role of AI throughout the perioperative continuum [26,28,33].

Predictive modeling has been increasingly applied in surgery to predict perioperative risk, postoperative morbidity, mortality, readmission rates, and healthcare resource utilization. Several studies have shown that machine learning algorithms can outperform traditional risk assessment tools by incorporating larger numbers of variables and adapting to changing clinical data. Notably, AI-driven systems such as MySurgeryRisk and other machine learning–based perioperative prediction platforms have demonstrated promising results for personalized risk assessment in various surgical populations [34,35,36,37,38,39,40].

The rapid growth of AI in healthcare has also highlighted the need for standardized reporting and methodological rigor. To promote transparency, reproducibility, and clinical utility, a variety of reporting frameworks have been developed, such as the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement, the Prediction Model Risk of Bias Assessment Tool (PROBAST), the Consolidated Standards of Reporting Trials–Artificial Intelligence (CONSORT-AI), the Standard Protocol Items: Recommendations for Interventional Trials–Artificial Intelligence (SPIRIT-AI), and the Developmental and Exploratory Clinical Investigations of Decision Support Systems Driven by Artificial Intelligence (DECIDE-AI) guidelines [21,22,23,30,31,32]. These initiatives are designed to set methodological standards for the development, validation, reporting, and clinical implementation of AI-based predictive models.

Although the potential of predictive analytics is considerable, important challenges remain. Many published AI models have been developed using retrospective single-center datasets, raising concerns about generalizability, external validation, algorithmic bias, and reproducibility. In addition, the “black-box” nature of some machine learning algorithms may limit clinician trust and impede widespread adoption in routine practice. Recent research has therefore increasingly focused on explainable AI approaches that can deliver transparent and clinically interpretable predictions [30,31,32].

Overall, artificial intelligence and predictive modeling represent a paradigm shift in surgical risk assessment. By drawing on large clinical datasets and sophisticated computational methods, these technologies hold promise to enhance patient stratification, inform clinical decision-making, and enable the shift to precision emergency surgery. [Figure 1]
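The supervised learning workflow described above can be sketched in a few lines. The example below fits a logistic regression by plain gradient descent on a tiny synthetic dataset; the two predictors (scaled age and serum lactate), the labels, and the function names are all invented for illustration and are not drawn from any cited study. Published models are typically trained with established libraries on far larger cohorts.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = p - yi  # gradient of the log-loss w.r.t. the linear term
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    """Predicted probability of the adverse outcome for one patient."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Synthetic labeled data: [age / 100, lactate in mmol/L]; 1 = adverse outcome.
X = [[0.3, 1.0], [0.4, 1.2], [0.5, 1.5], [0.7, 3.0], [0.8, 4.0], [0.9, 5.0]]
y = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(X, y)
p_low = predict(w, b, [0.35, 1.1])   # younger patient, normal lactate
p_high = predict(w, b, [0.85, 4.5])  # older patient, raised lactate
print(round(p_low, 3), round(p_high, 3))
```

The model learns its coefficients from the labeled examples instead of using fixed cut-offs. Tree-based methods such as random forests or XGBoost follow the same fit-then-predict pattern while also capturing interactions between variables.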

3.3. Disease-Specific Applications of AI in Emergency Abdominal Surgery

Artificial intelligence has shown promising potential in a wide spectrum of emergency abdominal surgical conditions. By integrating clinical, laboratory, radiological, and operative variables, AI-based predictive models can assist in earlier diagnosis, risk stratification, and personalized treatment decisions. Although the maturity of the available evidence varies across disease entities, current studies suggest that machine learning algorithms often achieve predictive performance comparable to, or even better than, traditional scoring systems [41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78].

3.3.1. Acute Appendicitis

Acute appendicitis is one of the most studied emergency surgical conditions in predictive modeling research. Conventional diagnostic systems such as the Alvarado Score, Appendicitis Inflammatory Response (AIR) Score, and Adult Appendicitis Score have improved diagnostic standardization but remain limited by moderate sensitivity and specificity, particularly in atypical presentations and complicated disease [41,42,43,44]. More recent machine learning approaches have integrated demographic features, laboratory biomarkers, imaging findings, and clinical symptoms to improve diagnostic accuracy and identify patients at higher risk of complicated appendicitis. Several studies have shown that machine learning algorithms can effectively predict perforation, gangrene, abscess formation, and postoperative complications and can reduce negative appendectomy rates [45,46,47,48,49,50]. Furthermore, explainable AI models have shown promise for producing transparent predictions, which could increase acceptance by clinicians and assist in surgical decision-making.

3.3.2. Acute Cholecystitis

Predictive analytics and artificial intelligence applications have become increasingly helpful in the management of acute cholecystitis. The Tokyo Guidelines still provide the basis for disease classification and severity assessment, but new AI models have been developed to predict severe inflammation, gangrenous cholecystitis, difficult laparoscopic cholecystectomy, conversion to open surgery, and perioperative complications [51,52,53,54,55,56,57,58,59,60]. Machine learning algorithms combining clinical variables, inflammatory markers, and imaging features have shown promising performance in identifying high-risk patients preoperatively. Progress has also been made in computer vision and the analysis of surgical videos, enabling automated evaluation of intraoperative anatomy and recognition of the Critical View of Safety in laparoscopic cholecystectomy. These technologies could help improve operative safety and reduce bile duct injuries in the future [54,55,56,57,58,59,60].

3.3.3. Intestinal Obstruction and Emergency Laparotomy

Patients presenting with intestinal obstruction or requiring emergency laparotomy represent one of the highest-risk groups in general surgery. Failure to recognize bowel ischemia, strangulation, or physiological deterioration in a timely fashion can significantly increase postoperative morbidity and mortality. Accurate prediction of disease progression and operative outcomes is therefore particularly important in this setting [61,62,63,64,65,66,67,68,69,70]. Machine learning models have been developed to identify patients who are likely to fail conservative management, need emergency surgery, or develop bowel ischemia. Other predictive algorithms have been developed to estimate postoperative mortality, the need for intensive care, and major complications after emergency laparotomy. Several studies have reported improved discrimination compared with conventional risk assessment tools, underlining the potential value of AI-assisted perioperative risk stratification in complex emergency surgical patients [65,66,67,68,69].

3.3.4. Secondary Peritonitis, Abdominal Sepsis, and Acute Mesenteric Ischemia

Secondary peritonitis, abdominal sepsis, and acute mesenteric ischemia are among the most severe conditions encountered in emergency abdominal surgery and remain associated with high mortality despite advances in critical care and source control strategies [71,72,73,74,75,76,77,78].

Recent advances in predictive analytics have sought to identify, at an early stage, patients at risk of septic deterioration, organ failure, intestinal necrosis, and death. Machine learning approaches applied to complex physiological and laboratory datasets have improved the prediction of adverse outcomes compared with traditional rule-based systems. In acute mesenteric ischemia, predictive models based on clinical findings, laboratory parameters, and imaging features have shown promise in identifying patients with irreversible intestinal ischemia who may benefit from expedited surgical intervention [76,77,78].

While many of these disease-specific applications are still in various stages of development and validation, the available evidence taken together suggests that AI-based risk prediction can enhance clinical decision-making across the spectrum of emergency abdominal surgery. Future multicenter validation studies and prospective clinical implementation trials are needed to assess the impact on patient outcomes and healthcare resource utilization.

Table 2. Representative Artificial Intelligence Models in Emergency Abdominal Surgery.

Study	Pathology	Dataset/Population	AI Method	Predicted Outcome	Performance
Kim et al. (2023) [45]	Acute Appendicitis	Systematic review and meta-analysis	Multiple ML models	Diagnosis of acute appendicitis	Pooled diagnostic accuracy superior to conventional scores
Byun et al. (2023) [46]	Acute Appendicitis	Pediatric patients with CT, laboratory and clinical data	Machine Learning	Complicated appendicitis	AUC > 0.85
Eickhoff et al. (2022) [47]	Perforated Appendicitis	Surgical cohort	Machine Learning	Postoperative complications	Improved risk discrimination compared with conventional variables
Wei et al. (2024) [48]	Acute Appendicitis	Clinical dataset	Machine Learning	Complicated appendicitis	AUC > 0.80
Schipper et al. (2024) [49]	Acute Abdominal Pain	Emergency department cohort	Machine Learning	Appendicitis diagnosis	High diagnostic accuracy and reduction of diagnostic uncertainty
Males et al. (2024) [50]	Acute Appendicitis	Prospective validation cohort	Explainable Machine Learning	Negative appendectomy reduction	Improved clinical decision support
Ward et al. (2022) [55]	Acute Cholecystitis	Laparoscopic cholecystectomy videos	Deep Learning / Computer Vision	Operative difficulty prediction	Accurate identification of severe inflammation
Mascagni et al. (2023) [54]	Acute Cholecystitis	Intraoperative video dataset	Artificial Intelligence	Critical View of Safety Recognition	High automated recognition accuracy
Hu et al. (2025) [57]	Acute Cholecystitis	Clinical and laboratory dataset	Explainable Machine Learning	Gangrenous cholecystitis	AUC > 0.85
Sun et al. (2025) [58]	Acute Cholecystitis	CT-based radiomics cohort	Radiomics + Machine Learning	Difficult laparoscopic cholecystectomy	Excellent discrimination performance
Cicerone et al. (2026) [59]	Acute Cholecystitis	Multicenter cohort	Machine Learning	Perioperative risk stratification	Improved prediction of adverse outcomes
Chen et al. (2025) [60]	Acute Cholecystitis	CT imaging dataset	Deep Learning	Suppurative cholecystitis	High diagnostic accuracy
Zielinski et al. (2010) [62]	Small Bowel Obstruction	Surgical cohort	Predictive statistical model	Need for surgery	Early identification of operative candidates
Schwenter et al. (2010) [63]	Small Bowel Obstruction	Clinical-radiological cohort	Prediction model	Strangulation risk	Good predictive discrimination
Mathiszig-Lee et al. (2022) [65]	Emergency Laparotomy	National cohort	Machine Learning	Mortality prediction	Improved uncertainty quantification
Mazzotta et al. (2024) [66]	Bowel Obstruction	Emergency surgery cohort	Machine Learning	Major postoperative complications	Superior risk prediction compared with traditional models
Jones et al. (2025) [67]	Emergency Laparotomy	ANZELA-QI database	Machine Learning	Mortality and major complications	Enhanced perioperative risk stratification
Soliman et al. (2025) [68]	Emergency Laparotomy	External validation cohort	Predictive model validation	Postoperative mortality	Successful external validation
Yuan et al. (2025) [69]	Abdominal Surgery	Multicenter surgical dataset	Machine Learning	Postoperative mortality	High predictive performance
Seymour et al. (2019) [73]	Abdominal Sepsis	Large sepsis cohort	Predictive phenotyping model	Sepsis subtypes and outcomes	Clinically relevant risk phenotypes
Komorowski et al. (2018) [74]	Sepsis	Intensive care database	Reinforcement Learning	Treatment optimization and mortality	AI-assisted therapeutic decision support

Representative studies illustrating the application of artificial intelligence, machine learning, deep learning, radiomics, and predictive analytics across major emergency abdominal surgical pathologies. These models target clinically relevant outcomes, including diagnosis, disease severity, operative complexity, postoperative complications, mortality, and treatment optimization. Collectively, they demonstrate the growing role of AI-driven risk stratification in supporting precision emergency surgery.
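Several entries in Table 2 report discrimination as the area under the receiver operating characteristic curve (AUC). As a minimal sketch of what that figure measures, the code below computes the AUC as the proportion of (event, non-event) patient pairs in which the patient who had the event received the higher predicted risk, counting ties as half. The predicted risks and outcomes are invented for illustration and do not come from any study in the table.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks and observed outcomes (1 = event occurred).
predicted_risk = [0.10, 0.30, 0.35, 0.40, 0.70, 0.90]
observed = [0, 0, 1, 0, 1, 1]

print(round(auc(predicted_risk, observed), 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The AUC describes ranking only and says nothing about calibration, so a model with a high AUC can still systematically over- or underestimate absolute risk.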

3.4. Explainable AI, Radiomics, and Clinical Implementation

Artificial intelligence (AI) has shown a promising ability to improve risk stratification in emergency abdominal surgery, yet successful clinical adoption of AI models depends not only on their predictive performance but also on their transparency, interpretability, and incorporation into daily surgical workflows. Explainable artificial intelligence (XAI), radiomics, and implementation science have therefore emerged as important fields of research to bridge the gap between algorithm development and real-world clinical application [81,82,83,84,85,86,87,88,89,90,91,92,93,94,95].

3.4.1. Explainable Artificial Intelligence

One of the main limitations of many machine learning algorithms is that they are perceived as a “black box”. Although highly complex models may achieve strong predictive performance, clinicians may be reluctant to trust recommendations that are not easily interpretable or explainable. This challenge is especially relevant in emergency surgery, where treatment decisions often have serious consequences and must be made under substantial time pressure [81,82,83,84,85].

To address these concerns, explainable AI techniques have been developed to provide insight into the factors that influence model predictions. Methods such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) are commonly used to estimate the contribution of each variable to a specific prediction [81,82]. These methods help clinicians identify the clinical, laboratory, or imaging findings that contribute most to an elevated risk estimate, which enhances transparency and can build confidence in AI-assisted decision-making. Explainable AI may be especially useful in emergency abdominal surgery, where the rationale for a risk prediction may be as important as the prediction itself. By providing insight into the variables that drive a model’s output, XAI systems can support clinician oversight, increase adoption, and help address concerns about algorithmic bias and unintended errors [83,84,85,86,87,88,89].
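To make the attribution idea concrete, the following minimal sketch computes exact Shapley values for a toy logistic risk model by averaging each variable's marginal contribution over all orderings. The model, its coefficients, the three variables (age, lactate, CRP), and both patient profiles are hypothetical illustrations and are not drawn from the reviewed studies; practical SHAP implementations approximate this calculation for models with many variables.

```python
from itertools import permutations
from math import exp


def toy_risk(features):
    """Hypothetical logistic risk model (illustrative only, not a validated score)."""
    z = (-4.0
         + 0.03 * features["age"]
         + 0.40 * features["lactate"]
         + 0.01 * features["crp"]
         + 0.002 * features["age"] * features["lactate"])  # interaction term
    return 1.0 / (1.0 + exp(-z))


def shapley_values(model, patient, baseline):
    """Exact Shapley attribution: average the marginal contribution of each
    variable over every ordering in which variables switch from their
    baseline value to the patient's value."""
    names = list(patient)
    contrib = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = patient[name]
            value = model(current)
            contrib[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in contrib.items()}


# Assumed reference profile and assumed high-risk patient (made-up values).
baseline = {"age": 50, "lactate": 1.0, "crp": 10}
patient = {"age": 78, "lactate": 4.5, "crp": 180}

phi = shapley_values(toy_risk, patient, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

By construction, the attributions sum to the difference between the patient's predicted risk and the baseline risk, which is the property that allows such values to be read as a per-variable breakdown of a single prediction.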

3.4.2. Radiomics-Based Prediction Models

Radiomics is an emerging field that converts routine medical images into high-dimensional quantitative data suitable for computational analysis. Through automated extraction of imaging features related to shape, texture, intensity, and spatial relationships, radiomics can identify patterns that are not apparent to the human observer [90,91]. Promising applications of radiomics in emergency abdominal surgery have been reported in acute appendicitis, acute cholecystitis, intestinal obstruction, and mesenteric ischemia. Radiomic features extracted from computed tomography scans can be integrated with clinical and laboratory variables to improve the prediction of disease severity, operative difficulty, postoperative complications, and treatment response [58,90,91]. Given the critical role of imaging in emergency surgical decision-making, radiomics may become an increasingly important component of future multimodal predictive models.
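As a simplified illustration of feature extraction, the sketch below derives three first-order features (mean, variance, and histogram entropy) from a small region of interest. The intensity values, the bin width, and the function name are assumptions made for illustration; real radiomics pipelines operate on segmented three-dimensional volumes and compute many additional shape and texture features.

```python
from collections import Counter
from math import log2


def first_order_features(patch, bin_width=25):
    """First-order (histogram-based) features from a 2D region of interest."""
    voxels = [v for row in patch for v in row]
    n = len(voxels)
    mean = sum(voxels) / n
    variance = sum((v - mean) ** 2 for v in voxels) / n
    # Discretize intensities into fixed-width bins before computing entropy.
    counts = Counter(v // bin_width for v in voxels)
    entropy = -sum((c / n) * log2(c / n) for c in counts.values())
    return {"mean": mean, "variance": variance, "entropy": entropy}


# Hypothetical 4x4 patches in Hounsfield units (made-up values).
homogeneous = [[40, 42, 41, 43]] * 4
heterogeneous = [
    [10, 90, 35, 120],
    [60, 15, 110, 45],
    [95, 30, 70, 5],
    [20, 130, 55, 85],
]

homog = first_order_features(homogeneous)
heter = first_order_features(heterogeneous)
print(homog)
print(heter)
```

The homogeneous patch falls into a single intensity bin and therefore has zero entropy, whereas the heterogeneous patch spreads across several bins and yields higher variance and entropy. Texture descriptors of this kind are what allow heterogeneity to be expressed as a number that a predictive model can use.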

3.4.3. Clinical Decision Support Systems

The goal of predictive analytics is not only to produce accurate risk estimates but also to enhance clinical decision-making and improve patient outcomes. Clinical decision support systems (CDSSs) embed predictive algorithms into healthcare environments to provide real-time recommendations based on patient-specific data [83,86,87]. Modern AI-powered decision support platforms can continuously analyze electronic health records, laboratory data, physiological parameters, and imaging data to identify high-risk patients and alert physicians to potential deterioration. In emergency surgery, such systems could assist in the early recognition of sepsis, bowel ischemia, complicated appendicitis, or severe cholecystitis, supporting timely intervention and optimized resource allocation [83,86]. Despite these promising results, adoption remains limited. Many AI systems have been validated only retrospectively, and relatively few have been prospectively evaluated in routine clinical practice. More research is needed to determine their efficacy, safety, and impact on patient outcomes before widespread implementation [84,85,86,87,88].

3.4.4. Integration into Surgical Workflows

The successful implementation of AI in emergency abdominal surgery relies on seamless integration into existing clinical workflows. Predictive systems should be accurate, explainable, and easy to use, and should provide actionable information without disrupting established routines. Integration with electronic health records and hospital information systems is particularly important to minimize additional workload and to ensure the timely availability of predictions [83,84,85,86,87,88,89]. Furthermore, external validation across different healthcare systems is needed to establish generalizability and reliability. Differences in patient populations, institutional practices, data quality, and healthcare infrastructure may have a large impact on model performance. Future efforts should therefore focus on multicenter prospective validation studies, standardized reporting practices, and continuous monitoring of deployed algorithms [21,22,23,83,84,85,86,87,88,89]. Explainable AI, radiomics, and implementation science are all crucial elements of the transition from experimental predictive models to clinically usable decision support tools. Their further development will be fundamental to realizing the full potential of artificial intelligence for precision emergency surgery. [Figure 2]

3.5. Ethical, Regulatory, and Implementation Challenges

Despite the growing enthusiasm surrounding artificial intelligence (AI) and predictive analytics in healthcare, several important barriers continue to limit their widespread adoption in emergency abdominal surgery. While many AI-based models have demonstrated promising predictive performance, successful clinical implementation requires careful consideration of ethical, regulatory, technical, and organizational factors. Addressing these challenges is essential to ensure that AI systems are safe, reliable, transparent, and capable of generating meaningful improvements in patient outcomes [83,84,85,86,87,88,89,90,91,92,93,94,95].

3.5.1. Data Quality and External Validation

The quality of the data used for model development and validation is a fundamental determinant of the performance of any predictive model. Because many of the available AI systems have been trained on retrospective datasets from single institutions, there are concerns about selection bias, incomplete data capture, and limited generalizability [83,84,85,86,87,88,89]. Differences in patient demographics, disease severity, healthcare infrastructure, and clinical practice patterns may have a substantial impact on model performance in an external environment. Algorithms that show good predictive performance in one institution may therefore perform much worse in other populations, and robust external validation based on multicenter datasets is necessary before clinical implementation can be contemplated [21,22,23,83,87].
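The loss of performance described above is usually quantified by comparing discrimination in the development cohort with discrimination in an external cohort. The following minimal sketch uses the rank-based definition of the area under the receiver operating characteristic curve (AUC); the predicted risks and outcomes are entirely made up, and the function name is an assumption introduced for illustration.

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen patient with the event
    receives a higher predicted risk than a randomly chosen patient without
    it (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up predicted risks and observed outcomes (1 = event occurred).
dev_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # development cohort
dev_labels = [1, 1, 1, 0, 0, 0]
ext_scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.5]   # external cohort
ext_labels = [1, 1, 1, 0, 0, 0]

dev_auc = auc(dev_scores, dev_labels)
ext_auc = auc(ext_scores, ext_labels)
print(f"development AUC = {dev_auc:.3f}")  # 1.000
print(f"external AUC    = {ext_auc:.3f}")  # 0.667
```

In this toy example the model separates the development cohort perfectly but ranks only six of nine event/non-event pairs correctly in the external cohort. Discrimination is only one aspect of validation; calibration in the new setting would need to be assessed as well.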

3.5.2. Algorithm Transparency and Interpretability

One of the most commonly cited concerns regarding AI implementation in medicine is the lack of interpretability of complex machine learning algorithms. Many high-performing models are “black boxes”: they produce a prediction but offer little insight into how each variable affects the final output. This lack of transparency may decrease clinician trust and acceptance in high-stakes clinical environments such as emergency surgery [81,82,83,84,85]. In recent years, explainable AI approaches such as SHAP and LIME have emerged as important tools to improve interpretability and provide insight into model behavior. By identifying the drivers of a specific prediction, these techniques allow clinicians to supervise model outputs and make informed decisions while retaining the predictive power of advanced machine learning systems [81,82].

3.5.3. Algorithmic Bias and Fairness

Algorithmic bias represents another major challenge for clinical AI systems. Predictive models trained on datasets that inadequately represent certain demographic or clinical subgroups may inadvertently generate unequal performance across populations. Such biases may affect predictions related to age, sex, ethnicity, socioeconomic status, or comorbidity burden, potentially contributing to disparities in healthcare delivery [83,85]. To minimize these risks, future AI development should emphasize representative datasets, transparent reporting of model performance across subgroups, and continuous post-implementation monitoring. Fairness assessments should become a routine component of model validation and regulatory evaluation before clinical deployment [85,88].
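One simple form of the subgroup reporting recommended above is to compare sensitivity across groups at a fixed alert threshold. The sketch below assumes a hypothetical threshold of 0.5, an illustrative age split, and invented records; a full fairness assessment would also examine calibration and other error rates within each subgroup.

```python
def sensitivity_by_group(records, threshold=0.5):
    """Per-group sensitivity (true positive rate) at a fixed alert threshold.

    records: iterable of (subgroup, predicted_risk, observed_outcome).
    """
    hits, events = {}, {}
    for group, risk, outcome in records:
        if outcome == 1:
            events[group] = events.get(group, 0) + 1
            hits[group] = hits.get(group, 0) + (1 if risk >= threshold else 0)
    return {group: hits[group] / events[group] for group in events}


# Invented records: (subgroup, predicted risk, observed outcome).
records = [
    ("age<65", 0.80, 1), ("age<65", 0.65, 1), ("age<65", 0.55, 1), ("age<65", 0.20, 0),
    ("age>=65", 0.70, 1), ("age>=65", 0.40, 1), ("age>=65", 0.35, 1), ("age>=65", 0.30, 0),
]

result = sensitivity_by_group(records)
gap = max(result.values()) - min(result.values())
print(result)
print(f"sensitivity gap = {gap:.2f}")
```

Here the model flags all three events in the younger group but only one of three in the older group, a disparity that an overall performance figure would conceal.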

3.5.4. Regulatory Frameworks and Governance

The rapid expansion of AI technologies has prompted the development of new regulatory frameworks intended to ensure patient safety and promote responsible innovation. In Europe, the Artificial Intelligence Act establishes a risk-based regulatory approach for AI systems, including those used in healthcare. The legislation introduces requirements related to transparency, documentation, human oversight, cybersecurity, and post-market surveillance [92]. Similarly, the World Health Organization has emphasized the importance of ethical governance, transparency, accountability, and human-centered design in healthcare AI applications. These recommendations highlight the necessity of maintaining clinician oversight and ensuring that AI systems augment rather than replace professional judgment [93].

3.5.5. Barriers to Clinical Adoption

Apart from the technical and regulatory questions, practical implementation challenges remain substantial. Successful deployment depends critically on integration with electronic health records, interoperability between hospital information systems, workflow compatibility, and clinician training [83,86]. In addition, prospective evidence of real-world clinical utility remains relatively sparse. Many published studies report excellent predictive performance under experimental conditions but provide little information about their impact on decision-making, healthcare costs, or patient outcomes. Therefore, additional prospective multicenter implementation studies are needed to evaluate the real clinical impact of AI-assisted risk stratification in emergency abdominal surgery [84,85,86,87,88,89]. These challenges are real but surmountable. Ongoing progress in explainable AI, regulatory oversight, data governance, and prospective validation is anticipated to facilitate the gradual integration of AI technologies into routine emergency surgical practice. Ultimately, successful implementation will depend on striking the right balance between technological innovation, clinical expertise, patient safety, and ethical responsibility. [Table 3]

3.6. Future Perspectives

The rapid development of artificial intelligence is expected to fundamentally change risk stratification and clinical decision-making in emergency abdominal surgery. Current predictive models mainly estimate isolated outcomes such as mortality, complications, or disease severity, but future systems are likely to incorporate multiple data sources and produce dynamic, patient-specific predictions across the whole perioperative pathway [81,82,83,84,85,86,87,88,89,90,91,92,93,94,95]. One of the most promising developments is real-time predictive analytics. Rather than relying on traditional static risk scores, next-generation AI platforms are expected to continuously monitor and analyze clinical, laboratory, physiological, and imaging data, enabling dynamic re-evaluation of a patient’s risk as new information becomes available. Such systems may facilitate earlier recognition of clinical deterioration, optimize surgical timing, and support individualized perioperative management strategies [83,84,85,86,87]. Another important area of future development is multimodal predictive modeling. AI algorithms that combine electronic health records, laboratory biomarkers, radiological imaging, operative findings, and data from wearable devices may achieve much greater predictive accuracy than models that rely on a single data source. The combination of radiomics and computer vision technologies is especially relevant to emergency abdominal surgery, where imaging plays a central role in diagnosis and treatment planning [58,90,91].

Recent advances in computer vision also show significant promise for intraoperative decision support. Automated recognition of anatomical structures, assessment of operative difficulty, identification of critical surgical landmarks, and real-time guidance during laparoscopic procedures may improve surgical safety and standardization of care. Future systems could provide surgeons with continuously updated risk assessments throughout the operation, enabling more informed intraoperative decision-making [54,55]. Federated learning has the potential to accelerate the development of robust predictive models by allowing collaborative training across multiple institutions without direct sharing of patient-level data. This approach can address many issues related to privacy, security, and data governance while increasing dataset diversity and improving model generalizability [85,86,87,88]. The long-term aim of these technological advances is precision emergency surgery, in which individualized risk assessment guides each step of the patient's management. In this context, AI systems would assist in diagnosis, surgical planning, intraoperative decision-making, postoperative monitoring, and prevention of complications while continuously integrating multimodal clinical information. These technologies are expected to serve as decision-support tools that complement clinical expertise and patient-centered care rather than replace surgeons [84,85,86,87,88,89,90,91,92,93,94,95]. However, these promising developments will require rigorous prospective validation, transparent reporting, regulatory oversight, and careful evaluation of clinical effectiveness before successful translation into routine practice. Future research should therefore not only enhance predictive performance but also show measurable benefits in terms of patient outcomes, healthcare efficiency, and resource utilization.
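The aggregation step at the heart of federated learning, mentioned above, can be sketched as a sample-size-weighted average of locally fitted coefficients, in the spirit of federated averaging. The site sizes and coefficients below are invented, and the sketch omits local training, repeated communication rounds, and any privacy-preserving mechanisms, so it should be read as an illustration of the principle only.

```python
def federated_average(site_updates):
    """Combine locally fitted coefficients into a shared model.

    site_updates: list of (n_patients, coefficients) tuples, one per hospital.
    Only coefficients and sample counts are exchanged, never patient records.
    """
    total = sum(n for n, _ in site_updates)
    n_coef = len(site_updates[0][1])
    return [
        sum(n * coefs[i] for n, coefs in site_updates) / total
        for i in range(n_coef)
    ]


# Invented example: two hospitals, each contributing two coefficients.
site_updates = [
    (100, [0.20, 1.00]),   # smaller center
    (300, [0.40, 2.00]),   # larger center
]

global_coefs = federated_average(site_updates)
print(global_coefs)  # approximately [0.35, 1.75]
```

Weighting by sample size means that the larger center influences the shared coefficients more strongly, which is one reason why the representativeness of participating institutions still matters in federated settings.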

Artificial intelligence will probably play an ever more significant role in emergency surgical care. With the ongoing evolution and maturation of predictive models, the transition from population-based risk estimation to truly personalized and data-driven emergency surgery may be possible.

4. Discussion

This review emphasizes the rapidly increasing role of predictive modeling and artificial intelligence (AI) in risk stratification in emergency abdominal surgery. Because emergency surgical patients often present with advanced disease, physiological instability, and multiple comorbidities, accurate prediction of adverse outcomes remains a key component of perioperative decision-making. While traditional risk assessment tools continue to provide useful prognostic information, emerging evidence suggests that AI-based predictive models may offer advantages by capturing complex nonlinear relationships among clinical variables and generating more individualized risk estimates [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40].

For decades, traditional scoring systems such as ASA, APACHE II, SOFA, POSSUM, P-POSSUM, SORT, and NELA have played an important role in perioperative risk assessment. These tools remain popular because they are relatively simple, well validated, and easy to interpret. However, many traditional models rely on predetermined variables and static statistical models, which may not adequately capture the dynamic and multifactorial nature of emergency surgical disease. In addition, their predictive performance is frequently heterogeneous across patient populations, healthcare systems, and disease entities, motivating the need for more flexible approaches [11,12,13,14,15,16,17,18,19,20].

Recent advances in machine learning and predictive analytics have introduced new opportunities to enhance risk stratification in emergency surgery. AI algorithms can handle large amounts of heterogeneous data, such as demographic information, laboratory parameters, imaging findings, operative variables, and longitudinal clinical records, which are difficult for conventional models to accommodate. Several of the studies reviewed here have shown that machine learning models often achieve better discrimination than traditional scoring systems in predicting mortality, postoperative complications, disease severity, and resource utilization [33,34,35,36,37,38,39,40]. These findings support the growing view that AI could supplement and, in some cases, eventually replace traditional risk assessment tools.

Among emergency abdominal pathologies, acute appendicitis and acute cholecystitis are currently the most advanced areas of AI research. Machine learning models have shown promising results in the detection of complicated appendicitis, prediction of perforation, reduction of the negative appendectomy rate, and estimation of operative difficulty during cholecystectomy [41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60]. Similarly, advances in computer vision have allowed for automated detection of key surgical landmarks and intraoperative anatomy, illustrating the potential of AI not only for preoperative risk assessment but also for intraoperative decision support. These trends are especially relevant in the context of the growing adoption of minimally invasive surgical techniques and digital operating room technologies.

Applications of AI in intestinal obstruction, emergency laparotomy, abdominal sepsis, and acute mesenteric ischemia are less developed but promising. These conditions are often characterized by high mortality and complex decision-making, which makes them good candidates for advanced predictive modeling. Recent studies indicate that machine learning algorithms can help identify patients who require urgent surgical intervention and can predict postoperative complications and mortality risk more accurately than conventional methods [61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78]. However, the evidence is limited by relatively small datasets, retrospective study designs, and a lack of large-scale prospective validation.

One of the main themes derived from the existing literature is the increasing need for explainability and clinical interpretability. Historically, the “black-box” nature of AI algorithms has raised concerns and limited clinician confidence, slowing the pace of implementation efforts. Explainable AI techniques, including SHAP and LIME, are important advances in gaining insight into the variables driving model predictions and enabling clinicians to better understand algorithmic outputs [81,82]. In high-stakes settings, such as emergency surgery, where treatment decisions can have immediate and potentially life-threatening consequences, transparency may be as critical as predictive accuracy itself.

Radiomics and multimodal predictive modeling are other areas of significant future interest. The ability to extract quantitative information from routine imaging studies and to combine that information with clinical and laboratory variables provides a means of more comprehensive risk assessment. Given the central role of computed tomography in the diagnosis and management of many emergency abdominal conditions [90,91], radiomics approaches may become increasingly valuable components of future predictive systems.

Progress has been encouraging, but several key challenges still limit widespread adoption. Most current AI models are built on retrospective single-center datasets and have not been rigorously externally validated. Algorithmic bias, missing data, differences in healthcare infrastructure, and limited interoperability between electronic health record systems continue to be major barriers to implementation [83,84,85,86,87,88,89]. Moreover, although many studies report excellent predictive performance, relatively few have shown meaningful improvements in clinical outcomes after real-world deployment. This gap points to the need to move beyond model development to prospective implementation studies that evaluate patient-centered outcomes.

The regulatory landscape for healthcare AI is also evolving rapidly. The European Artificial Intelligence Act and the growing emphasis on ethical governance by international organizations highlight the importance of transparency, accountability, and human oversight in clinical AI systems [92,93]. Thus, for routine clinical implementation, future predictive models must meet not only technical performance criteria but also ethical and regulatory requirements.

The present review has several limitations that should be acknowledged. First, the available literature is highly heterogeneous in terms of study design, patient populations, predictive variables, outcome measures, and machine learning methodologies. Second, many AI studies are from single institutions and therefore have limited generalizability. Third, the field is evolving rapidly, and new predictive models and validation studies are likely to appear soon after this review is published. Nevertheless, this review provides a comprehensive overview of the current state of AI-assisted risk stratification in emergency abdominal surgery, incorporating evidence from traditional risk assessment systems, predictive analytics, machine learning, radiomics, and explainable AI.

The available evidence currently suggests that artificial intelligence-based predictive models can significantly improve risk assessment and clinical decision-making in emergency abdominal surgery. However, translating these models successfully into routine practice will require robust multicenter validation, explainability, seamless workflow integration, and demonstration of tangible benefits for patients and healthcare systems. The future of emergency surgery will probably be characterized by growing collaboration between clinicians and intelligent decision support systems, with the ultimate aim of more accurate, personalized, and effective surgical treatment.

5. Conclusions

Artificial intelligence and predictive modeling are rapidly changing the landscape of risk stratification in emergency abdominal surgery. Compared with conventional scoring systems, AI-based methods can incorporate complex clinical, laboratory, and imaging data to provide more individualized risk assessment and improve the prediction of disease severity, postoperative complications, and mortality. Current evidence suggests promising applications in acute appendicitis, acute cholecystitis, intestinal obstruction, abdominal sepsis, and emergency laparotomy. However, important challenges remain, including the need for external validation, improved interpretability, regulatory compliance, and seamless integration into clinical workflows. AI should be viewed as a complementary tool that augments, not replaces, clinical judgment. Further validation and responsible implementation of AI-driven predictive models could aid precision emergency surgery and improve outcomes for patients undergoing emergency abdominal procedures.

Author Contributions

Conceptualization, C.C.-D. and C.M.; methodology, C.C.-D. and V.-O.B.; literature search and data curation, C.C.-D.; investigation, C.C.-D., V.-O.B., M.B., and D.M.; writing—original draft preparation, C.C.-D.; writing—review and editing, V.-O.B., M.B., D.M., and C.M.; visualization, C.C.-D.; supervision, C.M.; project administration, C.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Acknowledgments

The authors declare that the artificial-intelligence tool ChatGPT-5 (OpenAI, San Francisco, CA, USA) was used solely for linguistic refinement and limited graphical assistance during the final preparation stage of the manuscript. The tool provided support for English grammar correction, sentence rephrasing, readability optimization, and assistance in generating conceptual illustrative figure drafts under the direct supervision of the authors. No AI-generated scientific ideas, data, results, interpretations, or analytical conclusions were introduced into the study. All scientific concepts, literature synthesis, methodological interpretations, and final editorial decisions were entirely developed, validated, and approved by the authors. All figures and tables were critically reviewed, manually edited, and scientifically validated by the authors before submission to ensure methodological accuracy, transparency, and reproducibility. The authors confirm that every AI-assisted contribution complied with MDPI’s policy regarding the responsible and transparent use of artificial-intelligence tools in scholarly publishing.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
AIR	Appendicitis Inflammatory Response
APACHE II	Acute Physiology and Chronic Health Evaluation II
ASA	American Society of Anesthesiologists Physical Status Classification
AUC	Area Under the Receiver Operating Characteristic Curve
CDSS	Clinical Decision Support System
CONSORT-AI	Consolidated Standards of Reporting Trials–Artificial Intelligence
CT	Computed Tomography
DECIDE-AI	Developmental and Exploratory Clinical Investigations of Decision Support Systems Driven by Artificial Intelligence
DL	Deep Learning
EAS	Emergency Abdominal Surgery
EHR	Electronic Health Record
ICU	Intensive Care Unit
LIME	Local Interpretable Model-Agnostic Explanations
ML	Machine Learning
NELA	National Emergency Laparotomy Audit
POSSUM	Physiological and Operative Severity Score for the Enumeration of Mortality and Morbidity
PROBAST	Prediction Model Risk of Bias Assessment Tool
P-POSSUM	Portsmouth Physiological and Operative Severity Score for the Enumeration of Mortality and Morbidity
qSOFA	Quick Sequential Organ Failure Assessment
SHAP	SHapley Additive exPlanations
SOFA	Sequential Organ Failure Assessment
SORT	Surgical Outcome Risk Tool
SPIRIT-AI	Standard Protocol Items: Recommendations for Interventional Trials–Artificial Intelligence
TRIPOD	Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis
WHO	World Health Organization
XAI	Explainable Artificial Intelligence
XGBoost	Extreme Gradient Boosting

References

Pearse, R.M.; Harrison, D.A.; James, P.; Watson, D.; Hinds, C.; Rhodes, A.; Grounds, R.M.; Bennett, E.D. Identification and characterisation of the high-risk surgical population in the United Kingdom. Crit. Care 2006, 10, R81.
Symons, N.R.A.; Moorthy, K.; Almoudaris, A.M.; Bottle, A.; Aylin, P.; Vincent, C.A.; Faiz, O. Mortality in high-risk emergency general surgical admissions. Br. J. Surg. 2013, 100, 1318–1325.
Scott, J.W.; Olufajo, O.A.; Brat, G.A.; Rose, J.A.; Zogg, C.K.; Haider, A.H.; Salim, A.; Havens, J.M. Use of national burden to define operative emergency general surgery. JAMA Surg. 2016, 151, e160480.
Havens, J.M.; Olufajo, O.A.; Cooper, Z.R.; Haider, A.H.; Shah, A.A.; Salim, A. Defining Rates and Risk Factors for Readmissions Following Emergency General Surgery. JAMA Surg. 2016, 151, 330–336.
Gale, S.C.; Shafi, S.; Dombrovskiy, V.Y.; Arumugam, D.; Crystal, J.S. The public health burden of emergency general surgery in the United States: A 10-year analysis of the Nationwide Inpatient Sample. J. Trauma Acute Care Surg. 2014, 77, 202–208.
Ingraham, A.M.; Cohen, M.E.; Raval, M.V.; Ko, C.Y.; Nathens, A.B. Comparison of hospital performance in emergency versus elective general surgery operations at 198 hospitals. J. Am. Coll. Surg. 2011, 212, 20–28.e1.
GlobalSurg Collaborative. Mortality of emergency abdominal surgery in high-, middle- and low-income countries. Br. J. Surg. 2016, 103, 971–988.
Nepogodiev, D.; Martin, J.; Biccard, B.; Makupe, A.; Bhangu, A.; National Institute for Health Research Global Health Research Unit on Global Surgery. Global burden of postoperative death. Lancet 2019, 393, 401.
Saunders, D.I.; Murray, D.; Pichel, A.C.; Varley, S.; Peden, C.J. Variations in mortality after emergency laparotomy: The first report of the UK Emergency Laparotomy Network. Br. J. Anaesth. 2012, 109, 368–375.
Abbott, T.E.F.; Fowler, A.J.; Dobbs, T.D.; Harrison, E.M.; Gillies, M.A.; Pearse, R.M. Frequency of surgical treatment and related hospital procedures in the UK: A national ecological study using hospital episode statistics. Br. J. Anaesth. 2017, 119, 249–257.
Saklad, M. Grading of patients for surgical procedures. Anesthesiology 1941, 2, 281–284.
Knaus, W.A.; Draper, E.A.; Wagner, D.P.; Zimmerman, J.E. APACHE II: A severity of disease classification system. Crit. Care Med. 1985, 13, 818–829.
Vincent, J.L.; Moreno, R.; Takala, J.; Willatts, S.; De Mendonça, A.; Bruining, H.; Reinhart, C.K.; Suter, P.M.; Thijs, L.G. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. Intensive Care Med. 1996, 22, 707–710.
Seymour, C.W.; Liu, V.X.; Iwashyna, T.J.; Brunkhorst, F.M.; Rea, T.D.; Scherag, A.; Rubenfeld, G.; Kahn, J.M.; Shankar-Hari, M.; Singer, M.; et al. Assessment of clinical criteria for sepsis: For the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). JAMA 2016, 315, 762–774.
Copeland, G.P.; Jones, D.; Walters, M. POSSUM: A scoring system for surgical audit. Br. J. Surg. 1991, 78, 355–360.
Prytherch, D.R.; Whiteley, M.S.; Higgins, B.; Weaver, P.C.; Prout, W.G.; Powell, S.J. POSSUM and Portsmouth POSSUM for predicting mortality. Br. J. Surg. 1998, 85, 1217–1220.
Protopapa, K.L.; Simpson, J.C.; Smith, N.C.E.; Moonesinghe, S.R. Development and validation of the Surgical Outcome Risk Tool (SORT). Br. J. Surg. 2014, 101, 1774–1783.
Eugene, N.; Oliver, C.M.; Bassett, M.G.; Poulton, T.E.; Kuryba, A.; Johnston, C.; Anderson, I.D.; Moonesinghe, S.R.; Grocott, M.P.; Murray, D.M.; Cromwell, D.A.; Walker, K.; NELA collaboration. Development and internal validation of a novel risk adjustment model for adult patients undergoing emergency laparotomy surgery: the National Emergency Laparotomy Audit risk model. Br. J. Anaesth. 2018, 121, 739–748.
National Emergency Laparotomy Audit (NELA) Project Team. Seventh Patient Report of the National Emergency Laparotomy Audit; Royal College of Anaesthetists: London, UK, 2021; Available online: https://www.nela.org.uk (accessed on 13 June 2026).
Oliver, C.M.; Walker, E.; Giannaris, S.; Grocott, M.P.W.; Moonesinghe, S.R. Risk assessment tools validated for patients undergoing emergency laparotomy: A systematic review. Br. J. Anaesth. 2015, 115, 849–860.
Collins, G.S.; Reitsma, J.B.; Altman, D.G.; Moons, K.G.M. Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD): The TRIPOD Statement. Ann. Intern. Med. 2015, 162, 55–63.
Moons, K.G.M.; Altman, D.G.; Reitsma, J.B.; Ioannidis, J.P.A.; Macaskill, P.; Steyerberg, E.W.; Vickers, A.J.; Ransohoff, D.F.; Collins, G.S. Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): Explanation and Elaboration. Ann. Intern. Med. 2015, 162, W1–W73.
Wolff, R.F.; Moons, K.G.M.; Riley, R.D.; Whiting, P.F.; Westwood, M.; Collins, G.S.; Reitsma, J.B.; Kleijnen, J.; Mallett, S. PROBAST: A Tool to Assess the Risk of Bias and Applicability of Prediction Model Studies. Ann. Intern. Med. 2019, 170, 51–58.
Steyerberg, E.W. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating, 2nd ed.; Springer: Cham, Switzerland, 2019.
Deo, R.C. Machine Learning in Medicine. Circulation 2015, 132, 1920–1930.
Beam, A.L.; Kohane, I.S. Big Data and Machine Learning in Health Care. JAMA 2018, 319, 1317–1318.
Rajkomar, A.; Dean, J.; Kohane, I. Machine Learning in Medicine. N. Engl. J. Med. 2019, 380, 1347–1358.
Topol, E.J. High-Performance Medicine: The Convergence of Human and Artificial Intelligence. Nat. Med. 2019, 25, 44–56.
Sidey-Gibbons, J.A.M.; Sidey-Gibbons, C.J. Machine Learning in Medicine: A Practical Introduction. BMC Med. Res. Methodol. 2019, 19, 64.
Liu, X.; Rivera, S.C.; Moher, D.; Calvert, M.J.; Denniston, A.K.; SPIRIT-AI and CONSORT-AI Working Group. Reporting Guidelines for Clinical Trial Reports for Interventions Involving Artificial Intelligence: The CONSORT-AI Extension. Nat. Med. 2020, 26, 1364–1374.
Rivera, S.C.; Liu, X.; Chan, A.W.; Denniston, A.K.; Calvert, M.J.; SPIRIT-AI and CONSORT-AI Working Group. Guidelines for Clinical Trial Protocols for Interventions Involving Artificial Intelligence: The SPIRIT-AI Extension. Nat. Med. 2020, 26, 1351–1363.
Vasey, B.; Nagendran, M.; Campbell, B.; Clifton, D.A.; Collins, G.S.; Denaxas, S.; Gao, M.; Liu, X.; et al. Reporting Guideline for the Early-Stage Clinical Evaluation of Decision Support Systems Driven by Artificial Intelligence: DECIDE-AI. Nat. Med. 2022, 28, 924–933.
Hashimoto, D.A.; Rosman, G.; Rus, D.; Meireles, O.R. Artificial Intelligence in Surgery: Promises and Perils. Ann. Surg. 2018, 268, 70–76. [Google Scholar] [CrossRef] [PubMed]
Loftus, T.J.; Tighe, P.J.; Filiberto, A.C.; Efron, P.A.; Brakenridge, S.C.; Mohr, A.M.; Rashidi, P.; Upchurch, G.R.; Bihorac, A. Artificial Intelligence and Surgical Decision-Making. JAMA Surg. 2020, 155, 148–158. [Google Scholar] [CrossRef] [PubMed]
Bihorac, A.; Ozrazgat-Baslanti, T.; Ebadi, A.; Motaei, A.; Madkour, M.; Pardalos, P.M.; Lipori, G.; Hogan, W.R.; Efron, P.A.; Moore, F.A.; et al. MySurgeryRisk: Development and Validation of a Machine-Learning Risk Algorithm for Major Complications and Death After Surgery. Ann. Surg. 2019, 269, 652–662. [Google Scholar] [CrossRef] [PubMed]
Lee, C.K.; Hofer, I.; Gabel, E.; Baldi, P.; Cannesson, M. Development and Validation of a Deep Neural Network Model for Prediction of Postoperative In-Hospital Mortality. Anesthesiology 2018, 129, 649–662. [Google Scholar] [CrossRef] [PubMed]
Corey, K. M.; Kashyap, S.; Lorenzi, E.; Lagoo-Deenadayalan, S. A.; Heller, K.; Whalen, K.; Balu, S.; Heflin, M. T.; McDonald, S. R.; Swaminathan, M.; Sendak, M. Development and validation of machine learning models to identify high-risk surgical patients using automatically curated electronic health record data (Pythia): A retrospective, single-site study. PLoS Med. 2018, 15(11), e1002701. [Google Scholar] [CrossRef] [PubMed]
Merath, K.; Hyer, J. M.; Mehta, R.; Farooq, A.; Bagante, F.; Sahara, K.; Tsilimigras, D. I.; Beal, E.; Paredes, A. Z.; Wu, L.; Ejaz, A.; Pawlik, T. M. Use of Machine Learning for Prediction of Patient Risk of Postoperative Complications After Liver, Pancreatic, and Colorectal Surgery. Journal of gastrointestinal surgery: official journal of the Society for Surgery of the Alimentary Tract 2020, 24(8), 1843–1851. [Google Scholar] [CrossRef] [PubMed]
Senders, J.T.; Zaki, M.M.; Karhade, A.V.; Chang, B.; Gormley, W.B.; Broekman, M.L.D.; Smith, T.R.; Arnaout, O. An Introduction and Overview of Machine Learning in Neurosurgical Care. Acta Neurochir. 2018, 160, 29–38. [Google Scholar] [CrossRef] [PubMed]
Hassan, A. M.; Rajesh, A.; Asaad, M.; Nelson, J. A.; Coert, J. H.; Mehrara, B. J.; Butler, C. E. Artificial Intelligence and Machine Learning in Prediction of Surgical Complications: Current State, Applications, and Implications. Am. Surg. 2023, 89(1), 25–30. [Google Scholar] [CrossRef] [PubMed]
Alvarado, A. A practical score for the early diagnosis of acute appendicitis. Ann. Emerg. Med. 1986, 15, 557–564. [Google Scholar] [CrossRef] [PubMed]
Andersson, M.; Andersson, R.E. The Appendicitis Inflammatory Response score: A tool for the diagnosis of acute appendicitis that outperforms the Alvarado score. World J. Surg. 2008, 32, 1843–1849. [Google Scholar] [CrossRef] [PubMed]
Sammalkorpi, H.E.; Mentula, P.; Leppäniemi, A. A new adult appendicitis score improves diagnostic accuracy of acute appendicitis. BMC Gastroenterol. 2014, 14, 114. [Google Scholar] [CrossRef] [PubMed]
Atema, J.J.; van Rossem, C.C.; Leeuwenburgh, M.M.N.; Stoker, J.; Boermeester, M.A. Scoring system to distinguish uncomplicated from complicated acute appendicitis. Br. J. Surg. 2015, 102, 979–990. [Google Scholar] [CrossRef] [PubMed]
Ismayilzada, K. Artificial intelligence for acute appendicitis diagnosis: A systematic review of current evidence, challenges, and future directions. Medicine 2026, 105(12), e48094. [Google Scholar] [CrossRef] [PubMed]
Byun, J.; Park, S.; Hwang, S. M. Diagnostic Algorithm Based on Machine Learning to Predict Complicated Appendicitis in Children Using CT, Laboratory, and Clinical Features. Diagnostics 2023, 13(5), 923. [Google Scholar] [CrossRef] [PubMed]
Eickhoff, R. M.; Bulla, A.; Eickhoff, S. B.; Heise, D.; Helmedag, M.; Kroh, A.; Schmitz, S. M.; Klink, C. D.; Neumann, U. P.; Lambertz, A. Machine learning prediction model for postoperative outcome after perforated appendicitis. Langenbeck's Arch. Surg. 2022, 407(2), 789–795. [Google Scholar] [CrossRef] [PubMed]
Wei, W.; Tongping, S.; Jiaming, W. Construction of a clinical prediction model for complicated appendicitis based on machine learning techniques. Sci. Rep. 2024, 14(1), 16473. [Google Scholar] [CrossRef] [PubMed]
Schipper, A.; Belgers, P.; O'Connor, R.; Jie, K. E.; Dooijes, R.; Bosma, J. S.; Kurstjens, S.; Kusters, R.; van Ginneken, B.; Rutten, M. Machine-learning based prediction of appendicitis for patients presenting with acute abdominal pain at the emergency department. World J. Emerg. Surg. WJES 2024, 19(1), 40. [Google Scholar] [CrossRef] [PubMed]
Males, I.; Boban, Z.; Kumric, M.; et al. Applying an explainable machine learning model might reduce the number of negative appendectomies in pediatric patients with a high probability of acute appendicitis. Sci. Rep. 2024, 14, 12772. [Google Scholar] [CrossRef] [PubMed]
Yokoe, M.; Hata, J.; Takada, T.; Strasberg, S.M.; Asbun, H.J.; Wakabayashi, G.; Mori, Y.; Okamoto, K.; Iwashita, Y.; Hibi, T.; et al. Tokyo Guidelines 2018: Diagnostic criteria and severity grading of acute cholecystitis (with videos). J. Hepatobiliary Pancreat. Sci. 2018, 25, 41–54. [Google Scholar] [CrossRef] [PubMed]
Okamoto, K.; Suzuki, K.; Takada, T.; Strasberg, S.M.; Asbun, H.J.; Endo, I.; Iwashita, Y.; Hibi, T.; Pitt, H.A.; Umezawa, A.; et al. Tokyo Guidelines 2018: Flowchart for the management of acute cholecystitis. J. Hepatobiliary Pancreat. Sci. 2018, 25, 55–72. [Google Scholar] [CrossRef] [PubMed]
Basukala, S.; Neupane, S. C.; Shrestha, O.; Thapa, N.; Neupane, S.; Tamang, S. Preoperative prediction of difficult laparoscopic cholecystectomy: an institutional cross-sectional study. Ann. Med. Surg. (2012) 2025, 88(1), 144–150. [Google Scholar] [CrossRef] [PubMed]
Mascagni, P.; Vardazaryan, A.; Alapatt, D.; Urade, T.; Emre, T.; Fiorillo, C.; Pessaux, P.; Mutter, D.; Marescaux, J.; Costamagna, G.; Dallemagne, B.; Padoy, N. Artificial Intelligence for Surgical Safety: Automatic Assessment of the Critical View of Safety in Laparoscopic Cholecystectomy Using Deep Learning. Ann. Surg. 2022, 275(5), 955–961. [Google Scholar] [CrossRef] [PubMed]
Ward, T. M.; Hashimoto, D. A.; Ban, Y.; Rosman, G.; Meireles, O. R. Artificial intelligence prediction of cholecystectomy operative course from automated identification of gallbladder inflammation. Surg. Endosc. 2022, 36(9), 6832–6840. [Google Scholar] [CrossRef] [PubMed]
Orimoto, H.; Hirashita, T.; Ikeda, S.; Amano, S.; Kawamura, M.; Kawano, Y.; Takayama, H.; Masuda, T.; Endo, Y.; Matsunobu, Y.; Shinozuka, K.; Tokuyasu, T.; Inomata, M. Development of an artificial intelligence system to indicate intraoperative findings of scarring in laparoscopic cholecystectomy for cholecystitis. Surg. Endosc. 2025, 39(2), 1379–1387. [Google Scholar] [CrossRef] [PubMed]
Hu, Y.; Chen, Y.; Zhao, H. Development and Validation of an Explainable Machine Learning Model for Gangrenous Cholecystitis Prediction: A Multicenter Retrospective Study. J. Inflamm. Res. 2025, 18, 17747–17758. [Google Scholar] [CrossRef] [PubMed]
Sun, R. T.; Li, C. L.; Jiang, Y. M.; Hao, A. Y.; Liu, K.; Li, K.; Tan, B.; Yang, X. N.; Cui, J. F.; Bai, W. Y.; Hu, W. Y.; Cao, J. Y.; Qu, C. A radiomics-clinical predictive model for difficult laparoscopic cholecystectomy based on preoperative CT imaging: a retrospective single center study. World J. Emerg. Surg. WJES 2025, 20(1), 62. [Google Scholar] [CrossRef] [PubMed]
Cicerone, O.; Frassini, S.; Gallotti, A.; Vanoli, A.; Ansaloni, L.; Maestri, M.; Fugazzola, P.; S P Ri, M. A. C. C.; Collaborative Group. Development and validation of machine learning tools for predicting postoperative complications in acute calculous cholecystitis. In Updates in surgery; Advance online publication, 2025. [Google Scholar] [CrossRef] [PubMed]
Chen, B. Q.; Zang, W.; Liu, J. X.; Yang, Y.; Zhang, X. L.; Ju, R. H. Deep learning enables accurate diagnosis of acute cholecystitis and prediction of suppuration using noncontrast CT. iScience 2025, 28(12), 114180. [Google Scholar] [CrossRef] [PubMed]
Catena, F.; De Simone, B.; Coccolini, F.; Di Saverio, S.; Sartelli, M.; Ansaloni, L.; van Goor, H.; Moore, E.E.; Jeekel, J.; Biffl, W.; et al. Bowel obstruction: A narrative review for all physicians. World J. Emerg. Surg. 2019, 14, 20. [Google Scholar] [CrossRef] [PubMed]
Zielinski, M. D.; Eiken, P. W.; Bannon, M. P.; Heller, S. F.; Lohse, C. M.; Huebner, M.; Sarr, M. G. Small bowel obstruction-who needs an operation? A multivariate prediction model. World J. Surg. 2010, 34(5), 910–919. [Google Scholar] [CrossRef] [PubMed]
Schwenter, F.; Poletti, P. A.; Platon, A.; Perneger, T.; Morel, P.; Gervaz, P. Clinicoradiological score for predicting the risk of strangulated small bowel obstruction. Br. J. Surg. 2010, 97(7), 1119–1125. [Google Scholar] [CrossRef] [PubMed]
ten Broek, R.P.G.; Krielen, P.; Di Saverio, S.; Coccolini, F.; Biffl, W.L.; Ansaloni, L.; Velmahos, G.C.; Sartelli, M.; Fraga, G.P.; Kelly, M.D.; et al. Bologna guidelines for diagnosis and management of adhesive small bowel obstruction: 2017 update. World J. Emerg. Surg. 2018, 13, 24. [Google Scholar] [CrossRef] [PubMed]
Mathiszig-Lee, J. F.; Catling, F. J. R.; Moonesinghe, S. R.; Brett, S. J. Highlighting uncertainty in clinical risk prediction using a model of emergency laparotomy mortality risk. npj Digit. Med. 2022, 5(1), 70. [Google Scholar] [CrossRef] [PubMed]
Mazzotta, A.D.; Ferrara, F.; Geraci, G.; Cennamo, V.; Sica, G.S.; Coletta, D.; et al. Machine learning approaches for prediction of major complications in patients undergoing emergency surgery for bowel obstruction. J. Pers. Med. 2024, 14, 1043. [Google Scholar] [CrossRef] [PubMed]
Jones, D.; Blum, J.; Cartwright, C.; Verhagen, N.; Xu, S.; Denholm, B.; Southcott, L.; Turner, R. Applying Machine Learning to the ANZELA-QI Database to Predict Adverse Outcomes for Patients Undergoing Emergency Laparotomy. ANZ J. Surg. 2025, 95(10), 2080–2087. [Google Scholar] [CrossRef] [PubMed]
Soliman, H.; Smith, C.; Mena, J.; Yusuf, G. T.; Helmy, A. H. External validation of HAS model in predicting mortality after emergency laparotomy: a retrospective cohort study. Ann. R. Coll. Surg. Engl. 2025, 107(8), 540–544. [Google Scholar] [CrossRef] [PubMed]
Yuan, J. H.; Jin, Y. M.; Xiang, J. Y.; Li, S. S.; Zhong, Y. X.; Zhang, S. L.; Zhao, B. Machine learning-based prediction of postoperative mortality risk after abdominal surgery. World J. Gastrointest. Surg. 2025, 17(4), 103696. [Google Scholar] [CrossRef] [PubMed]
Moonesinghe, S. R.; Mythen, M. G.; Das, P.; Rowan, K. M.; Grocott, M. P. Risk stratification tools for predicting morbidity and mortality in adult patients undergoing major surgery: qualitative systematic review. Anesthesiology 2013, 119(4), 959–981. [Google Scholar] [CrossRef] [PubMed]
Sartelli, M.; Chichom-Mefire, A.; Labricciosa, F.M.; Hardcastle, T.C.; Abu-Zidan, F.M.; Adesunkanmi, A.K.; Ansaloni, L.; Bala, M.; Balogh, Z.J.; Beltrán, M.A.; et al. The management of intra-abdominal infections from a global perspective: 2017 WSES guidelines for management of intra-abdominal infections. World J. Emerg. Surg. 2017, 12, 29. [Google Scholar] [CrossRef] [PubMed]
Sartelli, M.; Kluger, Y.; Ansaloni, L.; Hardcastle, T.C.; Rasa, K.; Coccolini, F.; Baiocchi, G.L.; Di Saverio, S.; Moore, E.E.; Coimbra, R.; et al. Raising concerns about the Sepsis-3 definitions. World J. Emerg. Surg. 2018, 13, 6. [Google Scholar] [CrossRef] [PubMed]
Seymour, C.W.; Kennedy, J.N.; Wang, S.; Chang, C.C.H.; Elliott, C.F.; Xu, Z.; Berry, S.; Clermont, G.; Cooper, G.; Gomez, H.; et al. Derivation, validation, and potential treatment implications of novel clinical phenotypes for sepsis. JAMA 2019, 321, 2003–2017. [Google Scholar] [CrossRef] [PubMed]
Komorowski, M.; Celi, L.A.; Badawi, O.; Gordon, A.C.; Faisal, A.A. The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care. Nat. Med. 2018, 24, 1716–1720. [Google Scholar] [CrossRef] [PubMed]
Acosta, S. Epidemiology of mesenteric vascular disease: Clinical implications. Semin. Vasc. Surg. 2010, 23, 4–8. [Google Scholar] [CrossRef] [PubMed]
Kärkkäinen, J.M.; Acosta, S. Acute mesenteric ischemia (Part I): Incidence, etiologies, and how to improve early diagnosis. Best Pract. Res. Clin. Gastroenterol. 2017, 31, 15–25. [Google Scholar] [CrossRef] [PubMed]
Bala, M.; Kashuk, J.; Moore, E.E.; Kluger, Y.; Biffl, W.; Gomes, C.A.; Ben-Ishay, O.; Rubinstein, C.; Balogh, Z.J.; Civil, I.; et al. Acute mesenteric ischemia: Updated guidelines of the World Society of Emergency Surgery. World J. Emerg. Surg. 2022, 17, 54. [Google Scholar] [CrossRef] [PubMed]
Nuzzo, A.; Maggiori, L.; Ronot, M.; Becq, A.; Plessier, A.; Gault, N.; Joly, F.; Castier, Y.; Vilgrain, V.; Paugam, C.; Panis, Y.; Bouhnik, Y.; Cazals-Hatem, D.; Corcos, O. Predictive Factors of Intestinal Necrosis in Acute Mesenteric Ischemia: Prospective Study from an Intestinal Stroke Center. Am. J. Gastroenterol. 2017, 112(4), 597–605. [Google Scholar] [CrossRef] [PubMed]
Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30, 4765–4774. [Google Scholar]
Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why Should I Trust You?”: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144. [Google Scholar] [CrossRef]
Kelly, C.J.; Karthikesalingam, A.; Suleyman, M.; Corrado, G.; King, D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 2019, 17, 195. [Google Scholar] [CrossRef] [PubMed]
Topol, E.J. High-performance medicine: The convergence of human and artificial intelligence. Nat. Med. 2019, 25, 44–56. [Google Scholar] [CrossRef] [PubMed]
Vollmer, S.; Mateen, B.A.; Bohner, G.; Király, F.J.; Ghani, R.; Jonsson, P.; Cumbers, S.; Jonas, A.; McAllister, K.S.L.; Myles, P.; et al. Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness. BMJ 2020, 368, l6927. [Google Scholar] [CrossRef] [PubMed]
Sendak, M.P.; D'Arcy, J.; Kashyap, S.; Gao, M.; Nichols, M.; Corey, K.; Ratliff, W.; Balu, S. A path for translation of machine learning products into healthcare delivery. EMJ Innov. 2020, 4, 27–35. [Google Scholar] [CrossRef]
Van Calster, B.; Wynants, L.; Timmerman, D.; Steyerberg, E.W.; Collins, G.S. Predictive analytics in health care: How can we know it works? J. Am. Med. Inform. Assoc. 2019, 26, 1651–1654. [Google Scholar] [CrossRef] [PubMed]
Collins, G.S.; Dhiman, P.; Andaur Navarro, C.L.; Ma, J.; Hooft, L.; Reitsma, J.B.; Logullo, P.; Beam, A.L.; Peng, L.; Van Calster, B.; et al. Protocol for development of TRIPOD-AI and PROBAST-AI reporting and risk of bias tools. BMJ Open 2021, 11, e048008. [Google Scholar] [CrossRef] [PubMed]
Sounderajah, V.; Ashrafian, H.; Golub, R.M.; Shetty, S.; De Fauw, J.; Hooft, L.; Moons, K.G.M.; Darzi, A.; Devi, R. CONSORT-AI and SPIRIT-AI Collaborators. Developing a reporting guideline for artificial intelligence-centred diagnostic test accuracy studies: STARD-AI protocol. BMJ Open 2021, 11, e047709. [Google Scholar] [CrossRef] [PubMed]
Lambin, P.; Leijenaar, R.T.H.; Deist, T.M.; Peerlings, J.; de Jong, E.E.C.; van Timmeren, J.; Sanduleanu, S.; Larue, R.T.H.M.; Even, A.J.G.; Jochems, A.; et al. Radiomics: The bridge between medical imaging and personalized medicine. Nat. Rev. Clin. Oncol. 2017, 14, 749–762. [Google Scholar] [CrossRef] [PubMed]
Gillies, R.J.; Kinahan, P.E.; Hricak, H. Radiomics: Images are more than pictures, they are data. Radiology 2016, 278, 563–577. [Google Scholar] [CrossRef] [PubMed]
European Parliament and Council of the European Union. Regulation (EU) 2024/1689 of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). Off. J. Eur. Union 2024, L1689. [Google Scholar]
World Health Organization. Ethics and Governance of Artificial Intelligence for Health; WHO: Geneva, Switzerland, 2021. [Google Scholar]
Benjamens, S.; Dhunnoo, P.; Meskó, B. The state of artificial intelligence-based FDA-approved medical devices and algorithms: An online database. npj Digit. Med. 2020, 3, 118. [Google Scholar] [CrossRef] [PubMed]

Figure 1. Evolution from Conventional Risk Scores to AI-Based Risk Prediction Models. In emergency abdominal surgery, risk assessment has progressed from clinical judgment and static risk scores to predictive systems that use machine learning, deep learning, radiomics, and explainable artificial intelligence. These technologies aim to provide personalized risk estimates and to support real-time clinical decision-making.

Figure 2. Explainable AI Framework for Emergency Surgical Risk Prediction. Integration of clinical data, laboratory parameters, imaging features, machine learning algorithms, explainability tools (SHAP/LIME), and clinician-supported decision-making within a unified emergency surgery workflow.

Table 1. Conventional Risk Assessment Scores Used in Emergency Surgery.

Score	Main Variables	Primary Outcome Predicted	Advantages	Limitations
ASA Physical Status Classification	Overall health status, comorbidities, functional reserve	Perioperative morbidity and mortality	Simple, universally used, rapid bedside assessment	Subjective interpretation; interobserver variability; lacks procedure-specific variables
APACHE II	Physiological parameters, laboratory values, age, chronic health conditions	Hospital mortality and critical illness severity	Well validated in critically ill patients; comprehensive physiological assessment	Complex calculation; requires multiple laboratory measurements; limited emergency surgery specificity
SOFA	Respiratory, cardiovascular, hepatic, coagulation, renal, and neurological function	Organ dysfunction and mortality risk	Effective for monitoring organ failure progression; widely adopted in sepsis	Not specifically designed for surgical patients; requires repeated measurements
qSOFA	Respiratory rate, systolic blood pressure, mental status	Identification of high-risk septic patients	Rapid bedside application; no laboratory data required	Lower sensitivity than SOFA; limited predictive accuracy when used alone
POSSUM	Physiological variables and operative severity parameters	Postoperative morbidity and mortality	Incorporates both patient and surgical factors; extensively validated	May overestimate mortality in low-risk patients; relatively complex
P-POSSUM	Modified POSSUM equation using physiological and operative variables	Postoperative mortality	Improved mortality prediction compared with original POSSUM	Requires operative findings; less useful for preoperative decision-making
SORT	Age, ASA class, urgency, surgical severity, procedure type	30-day postoperative mortality	Simple and practical; good discrimination in diverse surgical populations	Limited incorporation of dynamic physiological variables
NELA Risk Model	Age, physiological status, laboratory parameters, urgency, operative factors	30-day mortality after emergency laparotomy	Specifically developed for emergency laparotomy; strong predictive performance	Primarily validated in laparotomy populations; external validation required across different healthcare systems

Summary of the most commonly used conventional risk assessment tools in emergency surgery, highlighting their principal variables, predicted outcomes, advantages, and limitations. Although these scoring systems remain valuable for perioperative risk estimation, their static structure and limited ability to model complex nonlinear interactions have stimulated the development of artificial intelligence–based predictive approaches.
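To make the "static structure" of these scores concrete, the sketch below evaluates the POSSUM and P-POSSUM mortality equations, which are fixed logistic functions of a physiological score and an operative severity score. The coefficients are those commonly reported for the published equations and should be checked against the original publications before any use; the function and constant names are hypothetical, and the code is an illustration only, not a clinical tool.

```python
import math

# Coefficients (intercept, physiological, operative) as commonly reported
# for the published mortality equations. Assumption: verify against the
# original POSSUM and P-POSSUM papers before relying on them.
POSSUM_MORTALITY = (-7.04, 0.13, 0.16)
P_POSSUM_MORTALITY = (-9.065, 0.1692, 0.1550)


def predicted_mortality(physiological_score, operative_score, coefficients):
    """Return predicted mortality risk (0 to 1) from a fixed logistic equation."""
    # 12 physiological and 6 operative items, each scored at least 1.
    if physiological_score < 12 or operative_score < 6:
        raise ValueError("scores below the minimum possible total (12 and 6)")
    intercept, b_phys, b_op = coefficients
    logit = intercept + b_phys * physiological_score + b_op * operative_score
    return 1.0 / (1.0 + math.exp(-logit))
```

With the minimum scores (12 and 6), the POSSUM equation returns a risk of roughly 1%, whereas P-POSSUM returns roughly 0.2%, which is consistent with the overestimation in low-risk patients listed in Table 1. Neither equation changes as new physiological data arrive unless the scores are recalculated by hand.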

Table 3. Challenges and Limitations of AI Implementation in Emergency Surgery.

Domain	Challenge	Potential Solution
Data Quality	Missing/incomplete data	Standardized data collection
Validation	Single-center development	Multicenter external validation
Interpretability	Black-box models	Explainable AI (SHAP, LIME)
Bias	Underrepresented populations	Fairness testing and monitoring
Regulation	Compliance requirements	AI Act and WHO frameworks
Integration	Workflow disruption	EHR-integrated decision support
Adoption	Limited clinician trust and acceptance	Transparent, validated, user-friendly AI systems

Summary of the principal challenges currently limiting the widespread implementation of artificial intelligence (AI) systems in emergency abdominal surgery. These barriers span technical, methodological, ethical, regulatory, and organizational domains: data quality, external validation, model interpretability, algorithmic bias, regulatory compliance, workflow integration, and clinician acceptance. For each challenge, a potential mitigation strategy is listed, including standardized data collection, multicenter validation, explainable AI techniques, fairness monitoring, regulatory oversight, and integration into clinical decision-support systems. Addressing these limitations will be essential for the safe, effective, and sustainable adoption of AI-driven risk stratification tools in emergency surgical practice.
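As one possible reading of the "fairness testing and monitoring" entry in Table 3, the sketch below compares model discrimination across patient subgroups by computing the area under the ROC curve (AUC) separately for each group. It assumes binary outcome labels (1 = event), predicted risks, and a subgroup label per patient; the function names are hypothetical and the approach is a minimal illustration rather than a complete fairness audit, which would also examine calibration and error rates.

```python
def auc(labels, scores):
    """AUC as the probability that a random event case outranks a non-event case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    # Count pairwise wins; ties contribute half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_group(labels, scores, groups):
    """Compute AUC separately within each subgroup label."""
    result = {}
    for group in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == group]
        result[group] = auc([labels[i] for i in idx], [scores[i] for i in idx])
    return result
```

A marked gap in AUC between subgroups (for example, between age bands or participating centers) would suggest that a model developed in one population may not transfer to another, which is the concern behind the "underrepresented populations" row.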

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.