Background/Objectives: Mitral valve surgery is associated with substantial perioperative heterogeneity and risk of postoperative complications. Although established risk scores such as EuroSCORE II provide population-level prognostic estimates, their performance may be limited in specific surgical populations and institutional settings. This pilot study aimed to develop and internally validate an interpretable center-specific machine learning model for perioperative risk stratification following mitral valve surgery and to explore its translational implementation through a prototype clinical application.
Methods: This retrospective single-center study included 211 consecutive patients undergoing mitral valve surgery with ring implantation. Routinely available demographic, laboratory, and perioperative variables were evaluated as candidate predictors. The primary endpoint was a composite of major postoperative complications, including in-hospital mortality, stroke, conversion to sternotomy, and rethoracotomy. Predictive approaches included logistic regression, LASSO regression, and random forest classification. Internal validation was performed using 5-fold cross-validation and bootstrap resampling. Model explainability was assessed using regression coefficients and SHAP (SHapley Additive exPlanations) analysis.
Results: The composite endpoint occurred in 34 patients (16.1%). Among the evaluated approaches, the simplified logistic regression model demonstrated the most favorable balance between interpretability and predictive performance, achieving a test-set AUC of 0.67 and a bootstrap-validated AUC of 0.69. Five-fold cross-validation yielded a mean AUC of 0.75. LASSO regression achieved the highest cross-validated discrimination (AUC 0.78) while retaining only a limited subset of predictors, with cardiopulmonary bypass time emerging as the dominant variable. Across models, older age, higher serum creatinine concentration, longer cardiopulmonary bypass duration, and longer cross-clamp time were associated with increased complication risk, whereas higher hemoglobin levels were associated with lower risk.
Conclusions: This pilot study demonstrates the feasibility of developing interpretable center-specific machine learning models for perioperative risk stratification following mitral valve surgery. Simplified regression-based approaches provided clinically transparent predictions with moderate discriminatory performance, whereas penalized models achieved higher cross-validated discrimination, suggesting potential for improved generalizability. Further multicenter validation is required before clinical implementation.
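The Methods mention bootstrap resampling and report discrimination as AUC, but the abstract does not say how the bootstrap was implemented. The following is a minimal sketch of one common variant, a percentile bootstrap of the AUC for a fixed set of predicted scores (not an optimism-corrected refit of the model). The function names `auc` and `bootstrap_auc` are hypothetical, and the scores are synthetic; only the cohort size (211) and event count (34) are taken from the abstract.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random event case receives a higher
    score than a random non-event case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc(labels, scores, n_boot=500, seed=0):
    """Resample patients with replacement and return the mean AUC plus an
    approximate 95% percentile interval over the bootstrap replicates."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip degenerate resamples that contain only one outcome class.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(0.025 * n_boot)]
    upper = stats[int(0.975 * n_boot) - 1]
    return sum(stats) / n_boot, lower, upper


# Synthetic illustration only: 34 events among 211 patients, with scores
# drawn from two shifted normal distributions (an assumption, not study data).
data_rng = random.Random(42)
labels = [1] * 34 + [0] * 177
scores = [data_rng.gauss(0.7 if y == 1 else 0.0, 1.0) for y in labels]

point = auc(labels, scores)
mean_auc, lower, upper = bootstrap_auc(labels, scores)
print(f"AUC {point:.2f}; bootstrap mean {mean_auc:.2f}, "
      f"95% interval {lower:.2f} to {upper:.2f}")
```

With only 34 events, an interval of this kind tends to be wide, which is one reason the point estimates from the test set, bootstrap, and cross-validation can differ noticeably.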