Preprint
Article

This version is not peer-reviewed.

Prediction of New Onset Atrial Fibrillation in Patients with Acute Myocardial Infarction Using Artificial Intelligence and Machine Learning Algorithms

Submitted:

01 July 2026

Posted:

03 July 2026


Abstract
Background/Objectives: New-onset atrial fibrillation is a common complication of acute myocardial infarction and is associated with increased mortality, heart failure, and recurrent cardiovascular events. Prediction models for the occurrence of atrial fibrillation may facilitate early risk stratification, improve patient monitoring, and support personalized therapeutic decision-making in clinical practice. Methods: A prospective study included 150 patients with acute myocardial infarction admitted within 24 hours of symptom onset. Clinical, laboratory, electrocardiographic, and echocardiographic data were collected, and a machine learning model was developed to predict new-onset atrial fibrillation. Model performance was evaluated using ROC-AUC, while SHAP analysis was applied to identify and quantify the contribution of individual predictors. Results: The machine learning model demonstrated excellent predictive performance, achieving an accuracy of 97.0%, ROC-AUC of 0.991, and precision–recall AUC of 0.991 in the independent test set. SHapley Additive Explanations (SHAP) and permutation importance analyses identified the E/e′ ratio, left ventricular mass index, left ventricular ejection fraction, left atrial volume index, oxidative stress markers (malondialdehyde and superoxide dismutase), NT-proBNP, and complete revascularization as the most influential predictors of new-onset atrial fibrillation after acute myocardial infarction. Conclusions: Machine learning combined with SHAP-based explainability showed high potential for predicting new-onset atrial fibrillation after acute myocardial infarction. This approach may improve individualized risk assessment and clinical decision-making, pending validation in larger multicenter cohorts.

1. Introduction

Several studies have reported a significant increase in the incidence of atrial fibrillation (AF) during acute myocardial infarction (AMI). However, it remains unclear whether new-onset AF carries the same implications for short-term clinical outcomes as well as long-term prognosis. A precise understanding of this phenomenon is essential for appropriate risk stratification and for optimizing therapeutic strategies, including decisions regarding anticoagulation and overall patient management [1,2]. In 2021, atrial fibrillation and atrial flutter contributed substantially to the global burden of disease, with a total of 4,484,926 new cases reported worldwide. The age-standardized incidence rate was 52.12 per 100,000 individuals. In addition, the overall prevalence reached 52,552,045 cases, with an age-standardized prevalence rate of 620.51 per 100,000 [3]. AF is associated with a markedly increased risk of adverse cardiovascular events, including AMI [4]. Prospective cohort studies have shown that individuals with AF have a higher likelihood of developing AMI compared with those without this arrhythmia. Furthermore, multiple investigations have demonstrated an increased occurrence of AF both in the immediate period following myocardial infarction and during long-term follow-up [5]. Reported rates of de novo AF in the setting of acute coronary syndrome vary widely, ranging from 2.3% to 21% [6]. New-onset symptomatic AF is commonly observed within the first few days following AMI; however, silent forms appear to be approximately three times more prevalent. In a study conducted by Stamboul et al., which employed continuous ECG monitoring for at least 48 hours after admission for myocardial infarction, 135 out of 849 patients (16%) developed silent AF, compared with only 45 out of 849 (5%) who presented with symptomatic AF [7]. Patients with silent AF were significantly older, more frequently female, and less likely to be smokers.
Similarly to symptomatic AF, the silent form was associated with an unfavorable prognosis at one year after acute coronary syndrome, with a marked increase in cardiovascular mortality and hospitalizations for heart failure compared with patients without AF (5.7% vs. 2.0%, p < 0.001 and 6.6% vs. 1.3%, p < 0.001). Wi et al. further defined transient new-onset AF as arrhythmia occurring in the setting of myocardial infarction, without prior history and with no documentation one month after hospital discharge (observed in 4.8% of patients). This transient form of AF was associated with the poorest clinical outcomes, serving as a strong independent predictor of major adverse cardiovascular events and mortality in patients with myocardial infarction at one month, as well as at two- and five-year follow-up [8].
The development of predictive models represents a promising strategy for risk stratification in patients with AMI. By integrating clinical, electrocardiographic, echocardiographic, and laboratory parameters, predictive models can identify vulnerable patients during the early stages of hospitalization, allowing clinicians to implement closer monitoring and individualized preventive interventions. Wang et al. demonstrated that electrocardiographic markers, particularly P-wave amplitude in lead V1, can be incorporated into a predictive model with good discriminative ability for new-onset AF after AMI [9]. Furthermore, recent investigations have focused on constructing nomograms and multivariable prediction tools to estimate the risk of AF development after percutaneous coronary intervention in patients with AMI. Zhang et al. developed and validated a nomogram that showed satisfactory predictive performance and may support clinical decision-making in routine practice [10]. The growing application of artificial intelligence and machine learning techniques in cardiology has further enhanced the potential of predictive modeling. Machine learning algorithms can analyze complex nonlinear relationships among numerous clinical variables and identify hidden patterns associated with adverse cardiovascular events. Jeong et al. demonstrated that dynamic machine learning models can accurately predict clinical outcomes after acute myocardial infarction, highlighting the potential of these approaches for improving personalized patient management [11].
However, most existing models were constructed using traditional statistical approaches and are characterized by several important limitations. Conventional scores generally assume linear relationships between predictors and outcomes, which may fail to capture the complex interactions among clinical, laboratory, echocardiographic, and electrocardiographic variables involved in the development of new-onset atrial fibrillation. In addition, many currently available models were derived from relatively small cohorts and often demonstrate limited generalizability when applied to external populations. Recent advances in artificial intelligence have demonstrated superior predictive performance compared with traditional statistical risk scores in various cardiovascular conditions. However, one of the main challenges limiting the clinical implementation of artificial intelligence models is their lack of interpretability. To address this issue, explainable artificial intelligence techniques, particularly SHapley Additive Explanations (SHAP), have been increasingly integrated into predictive models. SHAP enables the identification and quantification of the contribution of each variable to the final prediction, thereby improving transparency and clinician confidence in model outputs [12]. SHAP-based approaches have been applied to arrhythmia prediction, atrial fibrillation risk assessment, and the prediction of heart failure and cardiotoxicity, providing insights into the relative importance of electrocardiographic, laboratory, and echocardiographic variables [13,14]. Given the limited explainability of existing prediction models for new-onset atrial fibrillation after acute myocardial infarction, it would be valuable to investigate whether SHAP analysis can improve understanding of the factors driving atrial fibrillation development in this population.
Such an approach could not only enhance predictive accuracy but also provide clinically meaningful explanations that support individualized risk stratification and decision-making.

2. Materials and Methods

This prospective investigation was carried out at the Institute of Cardiology in Chișinău, Republic of Moldova. A total of 150 adult participants (aged 18 years and older) with a confirmed diagnosis of acute myocardial infarction were consecutively enrolled after providing written informed consent. Ethical approval was obtained from the Ethics Committee of the Nicolae Testemițanu State University of Medicine and Pharmacy (session dated 12 April 2018, approval No. 54). Eligibility was restricted to patients presenting within 24 hours from symptom onset. The research protocol adhered strictly to the ethical standards outlined in the Declaration of Helsinki. Individuals were excluded if they had a non-cardiac condition limiting life expectancy to under two years, a history of AF, cognitive impairment, or substance abuse. Participants were allocated into two equally sized cohorts: 75 patients who developed new-onset AF during the acute phase of myocardial infarction, and 75 patients who maintained sinus rhythm throughout hospitalization. All subjects underwent comprehensive clinical assessment, laboratory testing, and echocardiographic evaluation. Enrollment was performed using a random allocation process. Following discharge, patients were monitored over a two-year period, during which prespecified outcomes were recorded, including hospital admissions due to heart failure, cardiovascular mortality, and deaths from non-cardiovascular causes.

Statistical Analysis

Data processing and statistical evaluation were performed using RStudio (version 2024.09.1+394) alongside Python (version 3.12.3), enabling a fully reproducible analytical workflow. Continuous variables were described using appropriate summary measures such as mean, standard deviation, median, interquartile range, and range. Group differences for continuous variables were assessed with the Mann–Whitney U test. Categorical data were presented as counts and percentages, accompanied by 95% confidence intervals. Relationships between categorical variables were evaluated using Pearson’s chi-square test, with statistical significance determined through a Monte Carlo simulation approach based on 100,000 iterations. A two-tailed p-value threshold of 0.05 was considered indicative of statistical significance throughout the analysis.
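As an illustration of the Monte Carlo approach to Pearson's chi-square test described above, the following Python sketch estimates a p-value by repeatedly shuffling one variable and recomputing the statistic. The data, the function names, the seed, and the reduced number of iterations (2,000 rather than 100,000) are assumptions made for illustration only; the routine actually used in the study is not specified and may differ in detail.

```python
import random
from collections import Counter


def chi_square_stat(x, y):
    """Pearson chi-square statistic for two equally long lists of categories."""
    n = len(x)
    row_totals = Counter(x)
    col_totals = Counter(y)
    cells = Counter(zip(x, y))
    stat = 0.0
    for r, n_r in row_totals.items():
        for c, n_c in col_totals.items():
            expected = n_r * n_c / n
            stat += (cells[(r, c)] - expected) ** 2 / expected
    return stat


def monte_carlo_chi_square(x, y, n_iter=2000, seed=0):
    """Estimate the p-value by permuting y, which keeps both margins fixed."""
    rng = random.Random(seed)
    observed = chi_square_stat(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(y_perm)
        # Small tolerance so that ties with the observed value are counted.
        if chi_square_stat(x, y_perm) >= observed - 1e-12:
            hits += 1
    # The +1 terms count the observed table as one of the simulated tables.
    return observed, (hits + 1) / (n_iter + 1)


# Hypothetical data (NOT from the study): rhythm group versus a binary exposure.
group = ["AF"] * 20 + ["SR"] * 20
exposure = ["yes"] * 15 + ["no"] * 5 + ["yes"] * 5 + ["no"] * 15

stat, p_value = monte_carlo_chi_square(group, exposure)
print(f"chi-square = {stat:.2f}, Monte Carlo p = {p_value:.4f}")
```

Shuffling one variable while holding the other fixed preserves the row and column totals, so the simulated tables are drawn under the null hypothesis of independence. A larger number of iterations reduces the simulation error of the estimated p-value.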
The predictive model developed for identifying the risk of new-onset atrial fibrillation in patients with acute myocardial infarction was assessed using standard discrimination metrics, including the area under the receiver operating characteristic curve (AUC-ROC). To enhance model interpretability, SHAP was applied to quantify the influence of individual variables on the prediction outcome. Global and local SHAP analyses were performed to determine the most relevant predictors associated with atrial fibrillation occurrence and to evaluate their contribution across different risk profiles. Graphical representations, such as SHAP summary plots and dependence plots, were used to visualize feature importance and explore potential interactions between clinical, laboratory, electrocardiographic, and echocardiographic parameters.

3. Results

3.1. General Characteristics of the Study Population

The study population consisted of 150 patients diagnosed with acute myocardial infarction. Men represented the majority of the cohort, accounting for 100 participants (66.7%, 95% CI: 59–74), while women comprised 50 participants (33.3%, 95% CI: 26–41). In terms of place of residence, 72 individuals (48.0%, 95% CI: 40–56) lived in urban areas and 78 (52.0%, 95% CI: 44–60) resided in rural settings. Participants ranged in age from 37 to 90 years, with a median age of 67.0 years (IQR = 13.0). Stratification by age revealed that 58 patients (38.7%) were younger than 65 years, 72 patients (48.0%) were between 65 and 79 years of age, and 20 patients (13.3%) were aged 80 years or older. Assessment of lifestyle characteristics indicated that low levels of daily physical activity predominated within the cohort. A sedentary or minimally active lifestyle was reported by 107 participants (71.3%), whereas moderate physical activity was observed in 41 individuals (27.3%). Body weight measurements varied between 55 and 132 kg, with a median value of 86.5 kg (IQR = 22.8). The median body mass index was 29.0 kg/m² (IQR = 8.0), ranging from 18.8 to 44.0 kg/m². Obesity, defined as a BMI exceeding 30 kg/m², was present in 65 patients (43.3%). When participants were classified according to nutritional status, overweight emerged as the most common category, affecting 55 individuals (36.7%). Both normal weight and class I obesity were observed in 36 participants each (24.0%). Furthermore, an enlarged waist circumference, a marker of central adiposity associated with increased cardiometabolic risk, was identified in 69 patients (46.0%). A positive cardiovascular family history was documented in 16 participants (10.7%). Evaluation of coexisting medical conditions demonstrated that arterial hypertension was the most prevalent comorbidity, affecting 102 patients (68.0%).
Type 2 diabetes mellitus was recorded in 37 individuals (24.7%), reflecting the substantial burden of metabolic disorders among patients with acute myocardial infarction. Previous ischemic stroke was reported by 15 patients (10.0%), while peripheral arterial disease was identified in 7 cases (4.7%). Regarding prior cardiovascular events, a history of myocardial infarction was noted in 15 patients (10.0%), and chronic heart failure was present in 42 patients (28.0%). These findings underscore the high prevalence of cardiovascular comorbidities and risk factors within the study cohort.
A detailed overview of the cardiovascular risk factors and comorbid conditions identified in the study population is provided in Table 1.
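The 95% confidence intervals quoted for the sex and residence proportions can be checked with a short calculation. The sketch below assumes a normal-approximation (Wald) interval; the interval method used in the study is not stated, so this is an assumption, although it does reproduce the reported bounds for these four proportions. The function name is illustrative.

```python
from math import sqrt


def wald_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) 95% confidence interval for a proportion."""
    p = successes / n
    half_width = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Counts taken from the text above (total cohort of 150 patients).
counts = {"men": 100, "women": 50, "urban": 72, "rural": 78}
for label, count in counts.items():
    lower, upper = wald_ci(count, 150)
    print(f"{label}: {count / 150:.1%} (95% CI {lower:.0%} to {upper:.0%})")
```

For proportions close to 0 or 1, or for smaller samples, a Wilson or exact interval would usually be preferred over the Wald interval.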
Within the overall study population, ST-segment elevation myocardial infarction represented the most common clinical presentation, being observed in 104 patients (69.3%), whereas non-ST-segment elevation myocardial infarction was identified in 46 patients (30.7%). Based on the etiological classification, type 1 myocardial infarction predominated, accounting for 147 cases (98.0%). Periprocedural myocardial infarctions were uncommon, with one patient (0.7%) classified as type 4 MI and two patients (1.3%) as type 5 MI. Cardiogenic shock was documented in 10 patients (6.7%).
Analysis of infarct localization demonstrated that anterior wall involvement was the most frequently encountered pattern, affecting 74 patients (49.3%). Inferior wall infarction was recorded in 61 patients (40.7%), while apical involvement was less common and was observed in 15 cases (10.0%). Consistent with these findings, the left anterior descending artery was the vessel most often identified as responsible for the acute event, being implicated in 65 patients (43.3%). The right coronary artery was the culprit vessel in 53 cases (35.3%), whereas the left circumflex artery was involved in 27 patients (18.0%). Left main coronary artery disease was infrequent and was detected in only 5 patients (3.3%). Multivessel coronary artery disease was present in 125 patients (83.3%), indicating a high burden of coronary atherosclerosis across the cohort.
The median SYNTAX score was 28.0 (IQR = 12.0), with values ranging from 14.0 to 44.0. According to SYNTAX score categories, 37 patients (24.7%) were classified as having low anatomical complexity, 56 patients (37.3%) had intermediate complexity, and 57 patients (38.0%) were assigned to the high-complexity category. Overall, intermediate and high SYNTAX scores were identified in the majority of patients.
Regarding reperfusion therapy, 89 patients (59.3%) underwent revascularization during the initial angiographic procedure. Complete revascularization was achieved in 48 patients (32.0%). Concerning treatment delay, 13 patients (8.7%) were revascularized within the first 3 hours after symptom onset, a further 13 patients (8.7%) underwent intervention between 3 and 6 hours, and 22 patients (14.7%) received revascularization after more than 6 hours. Delayed presentation was reported in 102 patients (68.0%).
At the time of admission, several classes of cardiovascular medications were already being used by study participants. ACE inhibitors or angiotensin receptor blockers were prescribed to 69 patients (46.0%), beta-blockers to 46 patients (30.7%), and combined beta-blocker plus ACE inhibitor/ARB therapy to 29 patients (19.3%). Statin treatment was recorded in 46 patients (30.7%), while diuretics were administered to 33 patients (22.0%). SGLT2 inhibitors and antiplatelet agents were used by 34 (22.7%) and 32 patients (21.3%), respectively. Other medications, including calcium channel blockers and gastroprotective agents, were reported in 29 (19.3%) and 33 patients (22.0%), respectively.

3.2. Development of an Artificial Intelligence Model for Predicting New-Onset Atrial Fibrillation Following Acute Myocardial Infarction

To identify potential predictors, variables related to the cardiovascular profile and cardiovascular risk factors were evaluated alongside serological and echocardiographic parameters, previous cardiovascular medication use and treatment dosage, and clinical and angiographic features of acute myocardial infarction, including the number of diseased coronary vessels, the culprit artery, the extent of coronary artery involvement, and the location of myocardial injury. The prediction model was developed using data from 150 participants, with no missing data recorded.
The predictive model was evaluated on an independent test dataset comprising 30 patients (20% of the total cohort of 150 participants). The confusion matrix demonstrated excellent discriminative performance. Among the 15 patients with sinus rhythm, 14 were correctly classified, while only one patient was incorrectly predicted as having atrial fibrillation. Conversely, all 15 patients with atrial fibrillation were correctly identified by the model, with no false-negative classifications. Overall, 29 of the 30 test cases were correctly classified, corresponding to an accuracy of 96.7% (reported as 97.0% after rounding to 0.97). For the sinus rhythm class, the model achieved a precision of 1.00, a recall of 0.93, and an F1-score of 0.97. For atrial fibrillation, precision was 0.94, recall reached 1.00, and the corresponding F1-score was 0.97. The macro-average and weighted-average values for precision, recall, and F1-score were all 0.97, indicating balanced performance across both outcome categories (Figure 1).
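The per-class metrics above follow directly from the four cells of the confusion matrix. The short Python sketch below recomputes them from the counts given in the text; the function and variable names are illustrative and are not taken from the study's code.

```python
def class_metrics(tp, fp, fn):
    """Precision, recall, and F1-score for one class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Rows = true class, columns = predicted class; order: sinus rhythm, AF.
confusion = [[14, 1],
             [0, 15]]

# Sinus rhythm: 14 correct, 1 missed (called AF), 0 AF patients called sinus.
sinus = class_metrics(tp=confusion[0][0], fp=confusion[1][0], fn=confusion[0][1])
# Atrial fibrillation: 15 correct, 1 sinus patient called AF, 0 missed.
af = class_metrics(tp=confusion[1][1], fp=confusion[0][1], fn=confusion[1][0])
accuracy = (confusion[0][0] + confusion[1][1]) / sum(map(sum, confusion))

print("sinus rhythm:", [round(v, 2) for v in sinus])
print("atrial fibrillation:", [round(v, 2) for v in af])
print("accuracy:", round(accuracy, 3))
```

With a single misclassified patient in a test set of 30, every one of these metrics changes in steps of several percentage points, so the estimates carry considerable uncertainty.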
The discriminatory performance of the prediction model was further evaluated using receiver operating characteristic (ROC) curve analysis. The model achieved an area under the ROC curve (AUC) of 0.991, substantially exceeding the reference line corresponding to random classification (AUC = 0.500). This finding indicates an excellent ability of the model to distinguish between patients who developed atrial fibrillation and those who remained in sinus rhythm.
Visual inspection of the ROC curve demonstrated that the model consistently maintained high sensitivity across a broad range of specificity thresholds, with the curve closely approaching the upper-left corner of the graph. Such a pattern is characteristic of highly accurate classification models and reflects a very low rate of misclassification.
The ROC analysis is consistent with the results obtained from the confusion matrix and classification report, which showed an overall accuracy of 97.0% and only one incorrectly classified patient in the test dataset (Figure 2).
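The area under the ROC curve can be read as the probability that a randomly chosen patient with atrial fibrillation receives a higher predicted score than a randomly chosen patient in sinus rhythm. The sketch below computes the AUC from this pairwise definition; the scores are hypothetical and the function name is illustrative, since the individual predicted probabilities are not reported.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Tied scores count as half a correctly ranked pair.
    """
    concordant = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities (NOT from the study).
af_scores = [0.9, 0.8, 0.4]
sinus_scores = [0.7, 0.3, 0.2]
print(round(roc_auc(af_scores, sinus_scores), 3))  # 8 of 9 pairs -> 0.889
```

Under this definition, a test set of 15 patients per class yields 225 pairs, so an AUC of 0.991 would correspond to roughly 223 correctly ranked pairs, assuming no tied scores.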
The performance of the prediction model was additionally assessed using a precision–recall (PR) curve. The model achieved an F1-score of 0.968 and an area under the PR curve of 0.991, indicating excellent predictive performance. The PR curve remained markedly above the no-skill reference line throughout almost the entire range of recall values, demonstrating a consistently high precision in identifying patients with atrial fibrillation.
Notably, precision remained close to 1.00 across a wide spectrum of recall values, suggesting that the majority of patients classified as having atrial fibrillation were correctly identified. This finding highlights the model’s ability to maintain a favorable balance between sensitivity and positive predictive value, minimizing both false-negative and false-positive classifications.
The results of the precision–recall analysis are consistent with those obtained from the ROC curve and confusion matrix, which demonstrated an overall accuracy of 97.0%, an F1-score of 0.97, and only one misclassified patient in the test dataset (Figure 3).
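One common way to summarize a precision–recall curve is average precision, the mean of the precision values obtained at the rank of each true positive. The sketch below is a minimal illustration with hypothetical labels and scores; whether the study computed the PR-AUC this way or by trapezoidal integration is not stated, so the choice of estimator is an assumption.

```python
def average_precision(labels, scores):
    """Mean precision at the rank of each positive case (ties not handled)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_positives += 1
            precision_sum += true_positives / rank
    if true_positives == 0:
        raise ValueError("at least one positive label is required")
    return precision_sum / true_positives


# Hypothetical test cases (NOT from the study): 1 = atrial fibrillation.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]

no_skill = sum(labels) / len(labels)  # prevalence of the positive class
print(round(average_precision(labels, scores), 3), no_skill)
```

The no-skill reference line equals the prevalence of the positive class, which is 0.50 for a test set with 15 patients in each group.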
Figure 4 shows the SHAP summary plot, providing an overview of the relative importance of the predictors and their influence on the model output for new-onset atrial fibrillation. SHAP analysis was employed to enhance model interpretability by quantifying the contribution of each predictor to the final model output. The variables are ordered according to their mean absolute SHAP value, reflecting their overall importance in the prediction process. Each point represents an individual observation, while the position along the x-axis indicates the SHAP value, corresponding to the magnitude and direction of the variable's effect on the predicted outcome. Positive SHAP values shift the prediction toward a higher probability of the outcome, whereas negative values decrease the predicted probability. The color gradient denotes the actual feature value, with blue representing lower values and red representing higher values.
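To make the meaning of a SHAP value concrete, the sketch below computes exact Shapley values for a toy three-feature risk function by enumerating all feature subsets. The model, its coefficients, the feature values, and the background sample are all hypothetical, and the exhaustive enumeration is a teaching device; the SHAP library uses faster model-specific or approximate algorithms, and the explainer used in the study is not specified.

```python
from itertools import combinations
from math import factorial


def toy_risk(z):
    """Hypothetical risk score: z = [E/e' ratio, LV mass index, complete revasc.]."""
    e_ratio, lvmi, revasc = z
    return 0.03 * e_ratio + 0.004 * lvmi - 0.15 * revasc + 0.0002 * e_ratio * lvmi


def shapley_values(model, x, background):
    """Exact Shapley values; absent features are filled from background rows."""
    n = len(x)

    def value(subset):
        total = 0.0
        for b in background:
            z = [x[i] if i in subset else b[i] for i in range(n)]
            total += model(z)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical patient and background sample (NOT study data).
x = [14.0, 120.0, 1.0]
background = [[8.0, 90.0, 0.0], [10.0, 100.0, 1.0], [12.0, 110.0, 0.0]]

phi = shapley_values(toy_risk, x, background)
baseline = sum(toy_risk(b) for b in background) / len(background)
print("Shapley values:", [round(v, 4) for v in phi])
print("baseline + sum:", round(baseline + sum(phi), 4), "model:", round(toy_risk(x), 4))
```

The contributions add up to the difference between the prediction for this patient and the average prediction over the background sample. This additivity is what allows the summary plot to express each feature's effect on a common scale.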
The analysis revealed that the E/e′ ratio was the most influential predictor in the model. Higher E/e′ values were predominantly associated with positive SHAP values, indicating that elevated left ventricular filling pressures contributed substantially to an increased predicted risk. The second most important variable was left ventricular mass index, which also demonstrated a marked impact on model predictions, with lower values tending to be associated with negative SHAP values and higher values generally shifting predictions toward increased risk. Among the other echocardiographic parameters, left atrial volume indexed to body surface area played an important role, with larger values generally contributing to higher model outputs. Left ventricular ejection fraction was another key determinant: lower ejection fraction values were associated with increased risk, while preserved systolic function tended to reduce the predicted probability of the adverse outcome. In addition, ventricular dimensions, represented by left ventricular end-diastolic diameter and left ventricular end-diastolic volume, contributed to model performance, although with lower overall importance compared with the leading variables.
Among the biochemical markers, superoxide dismutase and malondialdehyde emerged as major contributors. Elevated malondialdehyde values, reflecting increased oxidative stress and lipid peroxidation, were associated with positive SHAP values and therefore a higher predicted risk. Conversely, higher superoxide dismutase levels tended to be associated with lower SHAP values, suggesting a protective effect of greater antioxidant capacity. Higher levels of NT-proBNP were associated with increased predicted risk. Likewise, an elevated neutrophil-to-lymphocyte ratio shifted predictions toward higher risk values.
Clinical characteristics also influenced model predictions. Complete revascularization was identified as one of the strongest protective factors, as patients who underwent complete revascularization generally exhibited negative SHAP values, indicating a reduction in predicted risk. Similarly, the presence of beta-blocker therapy and SGLT2 inhibitor treatment tended to be associated with lower model outputs, which is consistent with a potentially beneficial role. Traditional cardiovascular risk factors retained predictive relevance but exerted a smaller effect than the leading variables. Age, hypertension, and a history of previous myocardial infarction contributed to the model output, although their impact was less pronounced. The SYNTAX score, reflecting the complexity of coronary artery disease, also influenced predictions, with higher scores generally associated with increased risk.
To further investigate the relative contribution of individual predictors to the model's performance, a permutation feature importance analysis was conducted. This method quantifies the importance of each variable by measuring the decrease in predictive accuracy following random permutation of its values, while all other features remain unchanged. Variables associated with larger decreases in model performance are considered more influential for prediction.
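The permutation procedure described above can be sketched in a few lines of Python. The classifier, the data, the number of repeats, and the function names below are hypothetical and chosen only to show the mechanics; they do not reproduce the study's model or results.

```python
import random
from statistics import median


def accuracy(model, X, y):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Median drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Replace only column j; all other features remain unchanged.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(median(drops))
    return importances


def toy_model(row):
    """Hypothetical rule: predict AF (1) when the first feature exceeds 12."""
    return 1 if row[0] > 12 else 0


# Hypothetical data (NOT from the study): [E/e' ratio, unrelated noise feature].
X = [[15.0, float(i)] for i in range(10)] + [[9.0, float(i)] for i in range(10)]
y = [1] * 10 + [0] * 10

importances = permutation_importance(toy_model, X, y)
print("E/e' ratio:", importances[0], "noise:", importances[1])
```

A feature that the model never uses yields an importance of exactly zero, whereas shuffling a feature that drives the predictions degrades accuracy. The spread of the drops across repeats gives the interquartile ranges that indicate how stable each estimate is.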
The permutation importance analysis identified left ventricular mass index as the most influential predictor, exhibiting the largest median decrease in model accuracy after permutation. This finding indicates that the model relies heavily on this variable for generating accurate predictions. Other highly influential predictors included the E/e′ ratio, left ventricular ejection fraction, and thyroid-stimulating hormone levels, all of which demonstrated substantial reductions in predictive performance when their information was disrupted.
Among the biochemical parameters, markers of oxidative stress, namely superoxide dismutase and malondialdehyde, ranked among the most important variables, suggesting a significant contribution of oxidative stress-related mechanisms to the predictive process. Structural and functional echocardiographic parameters, including left atrial volume indexed to body surface area and left ventricular end-diastolic diameter, also showed meaningful importance scores, highlighting the relevance of ventricular remodeling and cardiac function.
Clinical characteristics such as complete revascularization, neutrophil-to-lymphocyte ratio, left ventricular end-diastolic volume, hypertension, NT-proBNP, and SYNTAX score demonstrated moderate predictive contributions. In contrast, variables including troponin I, culprit vessel, infarct localization, multivessel disease, Killip class, and several other conventional clinical parameters displayed near-zero importance values, indicating that their inclusion provided little additional predictive information beyond that already captured by the most influential predictors. The relatively narrow interquartile ranges observed for the highest-ranking variables suggest good stability and reproducibility of the importance estimates across repeated permutations, supporting the robustness of the identified predictor hierarchy (Figure 5).

4. Discussion

The present study developed and validated an explainable artificial intelligence model for predicting new-onset atrial fibrillation in patients with acute myocardial infarction. Our model demonstrated good predictive performance, with an overall accuracy of 97.0%, an F1-score of 0.97, and an AUC of 0.991. These results compare favorably with previously published prediction models for new-onset atrial fibrillation after AMI, which generally reported AUC values ranging from 0.75 to 0.85 [15,16]. Wu et al. reported an AUC of approximately 0.78 in a model incorporating age, inflammatory markers, and metabolic indices [17], while Zhang et al. developed a nomogram-based model with moderate discriminative ability [9]. Similarly, Wang et al. demonstrated the predictive value of electrocardiographic parameters such as P-wave amplitude in lead V1, although the performance remained lower than that observed in the present study [18].
One of the main findings of our study was the dominant contribution of echocardiographic parameters. According to SHAP analysis, the E/e′ ratio emerged as the strongest predictor of new-onset atrial fibrillation. This observation is biologically plausible because elevated E/e′ reflects increased left ventricular filling pressure, a recognized mechanism promoting left atrial remodeling and atrial fibrillation development [19]. Previous studies have consistently demonstrated that diastolic dysfunction is strongly associated with AF occurrence in patients with cardiovascular disease and following myocardial infarction [20].
Left ventricular mass index was the most influential predictor in the permutation importance analysis and one of the leading predictors in the SHAP analysis. Increased left ventricular mass index reflects chronic myocardial remodeling and hypertrophy, both of which are associated with myocardial fibrosis and atrial structural changes that facilitate AF development. Similar associations between left ventricular hypertrophy and incident AF have been reported in several contemporary studies [21].
Another major echocardiographic determinant was indexed left atrial volume. Left atrial enlargement is considered one of the strongest markers of atrial remodeling and has consistently been associated with both incident and recurrent AF [22]. The importance of this variable in our model further supports the hypothesis that structural atrial abnormalities contribute substantially to the development of new-onset AF after AMI.
Reduced left ventricular ejection fraction also significantly influenced model predictions. Previous investigations have shown that impaired systolic function promotes neurohormonal activation, myocardial fibrosis, and atrial pressure overload, thereby increasing susceptibility to atrial arrhythmias [23]. Our findings are consistent with these observations and reinforce the role of ventricular dysfunction in post-infarction arrhythmogenesis.
An interesting finding of the present study is the prominent role of oxidative stress biomarkers. Malondialdehyde was associated with increased predicted risk, whereas superoxide dismutase exhibited a protective effect. Growing evidence indicates that oxidative stress contributes directly to atrial electrical and structural remodeling through inflammation, mitochondrial dysfunction, calcium-handling abnormalities, and fibrosis [24]. Korantzopoulos et al. and Sagris et al. demonstrated that oxidative stress markers are significantly elevated in patients with atrial fibrillation and may contribute to disease progression [25]. Our results suggest that oxidative stress may also play an important role in new-onset AF development after AMI.
Inflammation appears to represent another important mechanism. In the current model, a higher neutrophil-to-lymphocyte ratio was associated with increased AF risk. This finding is supported by previous studies demonstrating that inflammatory activation contributes to atrial fibrosis and electrical instability [26]. An elevated neutrophil-to-lymphocyte ratio has previously been identified as an independent predictor of new-onset AF in AMI populations.
Among biochemical variables, NT-proBNP was associated with increased risk of new-onset AF. Elevated natriuretic peptide levels reflect myocardial wall stress and have repeatedly been linked to both incident AF and adverse cardiovascular outcomes. The inclusion of NT-proBNP among the most important predictors supports previous evidence highlighting its value for risk stratification. Clinical and therapeutic variables also demonstrated substantial influence. Complete revascularization emerged as one of the strongest protective factors in the model. Previous studies have demonstrated that successful revascularization reduces residual ischemia, adverse remodeling, and recurrent cardiovascular events [27]. Our findings suggest that complete revascularization may additionally reduce the risk of developing new-onset AF. Similarly, beta-blocker therapy was associated with lower predicted risk. This finding is consistent with current evidence supporting beta-blockers as cornerstone therapies for reducing post-infarction arrhythmias through sympathetic modulation [28]. SGLT2 inhibitor treatment also appeared protective, which is in line with recent studies suggesting antiarrhythmic and cardioprotective effects mediated through reductions in inflammation, oxidative stress, and myocardial remodeling [29].
One strength of the present study is the application of explainable artificial intelligence techniques. The use of SHAP analysis enabled identification of the variables driving model predictions and addressed one of the principal limitations of machine-learning approaches, namely the lack of interpretability. Similar explainable artificial intelligence approaches have recently been employed in cardiovascular medicine for predicting mortality after myocardial infarction, recurrent atrial fibrillation, and cancer therapy-related cardiac dysfunction [30,31].
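To illustrate the idea behind SHAP attributions, the following minimal sketch computes exact Shapley values for a toy risk score. Everything in it is an assumption made for illustration: the function `toy_risk_model`, its coefficients, the 0/1 feature coding, and the single all-zero baseline are hypothetical and are not the study's model or data. SHAP implementations typically average over a background dataset rather than using one baseline instance, so this is a simplified instance of the underlying principle only.

```python
from itertools import permutations


def toy_risk_model(x):
    """Hypothetical risk score with one interaction term (NOT the study's model)."""
    return (
        0.30 * x["E_over_e"]
        + 0.20 * x["LAVI"]
        - 0.25 * x["complete_revasc"]
        + 0.10 * x["E_over_e"] * x["LAVI"]
    )


def shapley_values(model, instance, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    features = list(instance)
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)      # start from the reference patient
        previous = model(current)
        for f in order:
            current[f] = instance[f]  # switch feature f to the patient's value
            updated = model(current)
            totals[f] += updated - previous
            previous = updated
    return {f: total / len(orderings) for f, total in totals.items()}


# Illustrative inputs: 0 = reference level, 1 = elevated / present (assumed coding).
baseline = {"E_over_e": 0.0, "LAVI": 0.0, "complete_revasc": 0.0}
patient = {"E_over_e": 1.0, "LAVI": 1.0, "complete_revasc": 1.0}

phi = shapley_values(toy_risk_model, patient, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

In this toy setting the attributions sum to the difference between the patient's score and the baseline score, the interaction term is shared equally between the two interacting features, and the revascularization variable receives a negative (protective) attribution.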

5. Conclusions

The development of this machine learning model aimed to improve the identification of patients at increased risk by integrating a wide range of clinical, echocardiographic, angiographic, and biochemical parameters. By simultaneously analyzing multiple variables and their interactions, the model provides a more comprehensive assessment than traditional approaches based on a limited number of predictors. The interpretability analyses showed that markers of cardiac structure and function, oxidative stress, and treatment-related variables played a major role in determining the predicted risk. These findings are consistent with current evidence regarding the multifactorial mechanisms involved in cardiovascular disease progression and support the biological plausibility of the model. An important advantage of the proposed model is its ability to identify patients who may be at higher risk despite apparently similar clinical characteristics. This could facilitate earlier recognition of vulnerable individuals, closer monitoring, and a more personalized therapeutic approach. In addition, the results highlight the potential value of combining conventional cardiovascular risk markers with newer biomarkers and advanced imaging parameters to improve risk prediction.
Overall, our findings suggest that machine learning techniques may represent a useful complementary tool for risk stratification in clinical practice. Although further validation in larger external cohorts is necessary, the present model provides a promising framework for supporting clinical decision-making and improving individualized patient management.

6. Limitations

This study has several limitations that should be considered when interpreting the findings. Although the study was conducted prospectively, it was performed in a single tertiary cardiology center and included a relatively small sample of 150 patients. The limited sample size may restrict the generalizability of the results and increase the risk of model overfitting, particularly given the large number of candidate predictors evaluated.
The machine learning model was developed and tested using data derived from the same institutional cohort. Although an independent hold-out test set was used, external validation in geographically and clinically distinct populations was not performed. Therefore, the reported predictive performance may not fully reflect real-world performance in other healthcare settings. In addition, the test dataset consisted of only 30 patients, resulting in a limited number of outcome events available for model evaluation. Consequently, performance metrics such as accuracy, ROC-AUC, and precision–recall AUC may be unstable and potentially optimistic. The very high discrimination observed should therefore be interpreted with caution until confirmed in larger cohorts.
Another limitation is that the study design included equal numbers of patients with and without new-onset atrial fibrillation. While this approach facilitated model development, it does not reflect the true incidence of atrial fibrillation after acute myocardial infarction and may influence model calibration when applied in routine clinical practice.
Furthermore, several potentially relevant predictors were not systematically assessed, including advanced cardiac magnetic resonance imaging parameters, atrial strain measurements, genetic markers, and longitudinal biomarker trajectories. The inclusion of such variables may further improve predictive performance.
The proposed machine learning model was also not directly compared with established clinical risk prediction tools or conventional statistical models. Future studies should evaluate whether the observed predictive gain translates into clinically meaningful improvements beyond existing risk stratification approaches.
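Two of these points can be made concrete with a short sketch. The first function computes a Wilson score interval for a proportion, which shows how wide the uncertainty around an accuracy estimate is with only 30 test patients. The second applies the standard prior-shift correction that rescales a predicted probability from a balanced development sample to a different outcome prevalence. The inputs are assumptions chosen for illustration: the count of 29 correct predictions out of 30, the predicted probability of 0.80, and the 10% target incidence are hypothetical and are not reported study values. The correction also assumes that only the outcome prevalence differs between settings.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (default: about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def adjust_for_prevalence(p_model, train_prevalence, target_prevalence):
    """Rescale a predicted probability to a new outcome prevalence (prior shift).

    Assumes the predictor distribution within each outcome class is unchanged.
    """
    odds = p_model / (1 - p_model)
    train_odds = train_prevalence / (1 - train_prevalence)
    target_odds = target_prevalence / (1 - target_prevalence)
    adjusted_odds = odds * target_odds / train_odds
    return adjusted_odds / (1 + adjusted_odds)


# Hypothetical example: 29 of 30 test patients classified correctly.
acc_low, acc_high = wilson_interval(29, 30)
print(f"Accuracy 95% CI: {acc_low:.3f} to {acc_high:.3f}")

# Hypothetical example: model output 0.80 from a 50/50 sample, 10% target incidence.
p_adjusted = adjust_for_prevalence(0.80, train_prevalence=0.50, target_prevalence=0.10)
print(f"Prevalence-adjusted risk: {p_adjusted:.3f}")
```

Under these assumed inputs the interval spans roughly 0.83 to 0.99, and a predicted probability of 0.80 corresponds to an adjusted risk of about 0.31, which illustrates why recalibration would be needed before applying a model developed on a balanced sample to an unselected population.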
Despite these limitations, the study provides preliminary evidence supporting the potential utility of machine learning techniques for identifying patients at increased risk of new-onset atrial fibrillation following acute myocardial infarction. Larger multicenter studies with external validation are required before clinical implementation can be recommended.

Author Contributions

All authors contributed equally to the conception, design, data analysis, manuscript preparation, and revision of this study. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the State University of Medicine and Pharmacy “Nicolae Testemitanu” (session dated 12 April 2018, approval No. 54).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors thank their department for the support in this research.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

AF atrial fibrillation
AMI acute myocardial infarction
SHAP SHapley Additive Explanations

References

  1. Yang, W.Y.; Lip, G.Y.H.; Sun, Z.J.; Peng, H.; Fawzy, A.M.; Li, H.W. Implications of new-onset atrial fibrillation on in-hospital and long-term prognosis of patients with acute myocardial infarction: A report from the CBD bank study. Front Cardiovasc Med. 2022, 9.
  2. Cazacu, J.; Lîsîi, D.; Priscu, O.; Bursacovschi, D.; Dodu, S.; Guțan, I.; et al. Prognostic factors influencing all-cause mortality and hospitalizations after inpatient cardiac rehabilitation for acute coronary events. Arch. Balk. Med. Union 2024, 59(2).
  3. Jin, Y.; Wang, K.; Xiao, B.; Wang, M.; Gao, X.; Zhang, J.; et al. Global burden of atrial fibrillation/flutter due to high systolic blood pressure from 1990 to 2019: estimates from the global burden of disease study 2019. J. Clin Hypertens. 2022, 24(11).
  4. Kornej, J.; Börschel, C.S.; Benjamin, E.J.; Schnabel, R.B. Epidemiology of Atrial Fibrillation in the 21st Century. Circ. Res. 2020, 127(1).
  5. Lee, J.H.; Kim, S.H.; Lee, W.; Cho, Y.; Kang, S.H.; Park, J.J.; et al. New-onset paroxysmal atrial fibrillation in acute myocardial infarction: increased risk of stroke. BMJ Open. 2020, 10(9).
  6. Aronson, D.; Boulos, M.; Suleiman, A.; Bidoosi, S.; Agmon, Y.; Kapeliovich, M.; et al. Relation of C-Reactive Protein and New-Onset Atrial Fibrillation in Patients With Acute Myocardial Infarction. Am. J. Cardiol. 2007, 100(5).
  7. Stamboul, K.; Zeller, M.; Fauchier, L.; Gudjoncik, A.; Buffet, P.; Garnier, F.; et al. Incidence and prognostic significance of silent atrial fibrillation in acute myocardial infarction. Int. J. Cardiol. 2014, 174(3).
  8. Wi, J.; Shin, D.H.; Kim, J.S.; Kim, B.K.; Ko, Y.G.; Choi, D.; et al. Transient new-onset atrial fibrillation is associated with poor clinical outcomes in patients with acute myocardial infarction. Circ. J. 2016, 80(7).
  9. Wang, Z.; Bao, W.; Cai, D.; Hu, M.; Gao, X.; Li, C. Construction of a predictive model for new-onset atrial fibrillation after acute myocardial infarction based on P-wave amplitude in lead V1. J. Electrocardiol. 2024, 83.
  10. Zhang, L.X.; Cao, J.Y.; Zhou, X.J. Construction and validation of a nomogram prediction model for the risk of new-onset atrial fibrillation following percutaneous coronary intervention in acute myocardial infarction patients. BMC Cardiovasc Disord. 2024, 24(1).
  11. Jeong, J.H.; Lee, K.S.; Park, S.M.; Kim, S.R.; Kim, M.N.; Chae, S.C.; et al. Prediction of longitudinal clinical outcomes after acute myocardial infarction using a dynamic machine learning algorithm. Front Cardiovasc Med. 2024, 11.
  12. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems 2017.
  13. Pramudito, M.A.; Fuadah, Y.N.; Qauli, A.I.; Marcellinus, A.; Lim, K.M. Explainable artificial intelligence (XAI) to find optimal in-silico biomarkers for cardiac drug toxicity evaluation. Sci. Rep. 2024, 14(1).
  14. Bursacovschi, D.; Arnaut, O.; Ochisor, V.; Mihalache, G.; Baltaga, R.; Iacomi, V.; et al. Integrated Model for Predicting Cancer Therapy-Related Cardiac Dysfunction in Non-Hodgkin Lymphoma. Biomedicines 2025, 13(12).
  15. He, X.; Wang, J.; Ye, L.; Xu, L.; Gao, J. Effects of Cardiac Rehabilitation on Cardiac Function and Cardiovascular Adverse Events in Coronary Heart Disease Patients Following Percutaneous Coronary Intervention: A Systematic Review and Meta-analysis of Randomized Controlled Trials. Rev. Cardiovasc. Med. 2025.
  16. Carrozzo, A.; Cappucci, I.P.; Basile, L.; Tremoli, E.; Zavan, B.; Ferroni, L. Extracellular vesicles in cardiac surgery: unlocking new frontiers in cardioprotection and patient outcomes. Clinical and Experimental Medicine 2026.
  17. Wu, X.D.; Zhao, W.; Wang, Q.W.; Yang, X.Y.; Wang, J.Y.; Yan, S.; et al. Clinical predictive model of new-onset atrial fibrillation in patients with acute myocardial infarction after percutaneous coronary intervention. Sci. Rep. 2025, 15(1).
  18. Waller, A.H.; Horgan, S.; Groarke, J.D.; Valente, A.M.; Koplan, B.A.; Blankstein, R. Integration of cardiac magnetic resonance imaging in pre-procedural planning and electroanatomical mapping for catheter ablation after a Fontan-Bjork correction of tricuspid atresia. Eur. Heart J. Cardiovasc. Imaging 2014.
  19. Nagueh, S.F.; Smiseth, O.A.; Appleton, C.P.; Byrd, B.F.; Dokainish, H.; Edvardsen, T.; et al. Recommendations for the Evaluation of Left Ventricular Diastolic Function by Echocardiography: An Update from the American Society of Echocardiography and the European Association of Cardiovascular Imaging. J. Am. Soc. Echocardiogr. 2016, 29(4).
  20. Al-Mohaissen, M.; Lee, T.; Qattea, M.B. Longitudinal Changes in Diastolic Dysfunction in Heart Failure with Reduced Ejection Fraction: Clinical and Echocardiographic Associations. CJC Open. 2025, 7(12).
  21. Sposato, L.A.; Field, T.S.; Schnabel, R.B.; Wachter, R.; Andrade, J.G.; Hill, M.D. Towards a new classification of atrial fibrillation detected after a stroke or a transient ischaemic attack. Lancet Neurol. 2024.
  22. Doehner, W.; Boriani, G.; Potpara, T.; Blomstrom-Lundqvist, C.; Passman, R.; Sposato, L.A.; et al. Atrial fibrillation burden in clinical practice, research, and technology development: a clinical consensus statement of the European Society of Cardiology Council on Stroke and the European Heart Rhythm Association. Europace 2025, 27(3).
  23. Heidenreich, P.A.; Bozkurt, B.; Aguilar, D.; Allen, L.A.; Byun, J.J.; Colvin, M.M.; et al. 2022 AHA/ACC/HFSA Guideline for the Management of Heart Failure: A Report of the American College of Cardiology/American Heart Association Joint Committee on Clinical Practice Guidelines. Circulation 2022.
  24. Korantzopoulos, P.; Letsas, K.P.; Tse, G.; Fragakis, N.; Goudis, C.A.; Liu, T. Inflammation and atrial fibrillation: A comprehensive review. J. Arrhythmia. 2018.
  25. Sagris, M.; Vardas, E.P.; Theofilis, P.; Antonopoulos, A.S.; Oikonomou, E.; Tousoulis, D. Atrial fibrillation: Pathogenesis, predisposing factors, and genetics. Int. J. Mol. Sci. 2022.
  26. Dobrev, D.; Heijman, J.; Hiram, R.; Li, N.; Nattel, S. Inflammatory signalling in atrial cardiomyocytes: a novel unifying principle in atrial fibrillation pathophysiology. Nat. Rev. Cardiol. 2023.
  27. Svennberg, E.; Merino, J.L.; Andrade, J.; Anselmino, M.; Arbelo, E.; Boersma, E.; et al. Transforming atrial fibrillation management by targeting comorbidities and reducing atrial fibrillation burden: The 10th AFNET/EHRA consensus conference. Europace. 2025, 27(12).
  28. Yadava, O.P.; Narayan, P.; Padmanabhan, C.; Sajja, L.R.; Sarkar, K.; Varma, P.K.; et al. IACTS position statement on “2021 ACC/AHA/SCAI Guideline for Coronary Artery Revascularization”: section 7.1—a consensus document. Indian J. Thorac. Cardiovasc Surg. 2022, 38(2).
  29. Reddy, Y.N.V.; Borlaug, B.A.; Gersh, B.J. Management of Atrial Fibrillation Across the Spectrum of Heart Failure with Preserved and Reduced Ejection Fraction. Circulation. 2022.
  30. Palermi, S.; Vecchiato, M.; Ng, F.S.; Attia, Z.; Cho, Y.; Anselmino, M.; et al. Artificial intelligence and the electrocardiogram: A modern renaissance. Eur. J. Intern. Med. 2025.
  31. Fahim, Y.A.; Hasani, I.W.; Kabba, S.; Ragab, W.M. Artificial intelligence in healthcare and medicine: clinical applications, therapeutic advances, and future perspectives. Eur. J. Med. Res. 2025, 30(1).
Figure 1. Evaluation of the predictive performance of the model for detecting new-onset atrial fibrillation.
Figure 2. Predictive accuracy of the machine learning model for new-onset atrial fibrillation.
Figure 3. Precision–recall curve for prediction of new-onset atrial fibrillation.
Figure 4. SHAP summary plot of feature importance for predicting new-onset atrial fibrillation.
Figure 5. Variable Importance for New-Onset Atrial Fibrillation Prediction.
Table 1. Baseline demographic and clinical characteristics.
Characteristic | Total cohort (n = 150), n (%) | 95% CI
Previous stroke | 15 (10.0%) | 5.2%, 15%
Peripheral artery disease | 7 (4.7%) | 1.3%, 8.0%
Smoking | 49 (32.7%) | 25%, 40%
Alcohol consumption | 43 (28.7%) | 21%, 36%
Arterial hypertension | 102 (68.0%) | 61%, 75%
Hypertension grade | |
  Absence | 48 (32.0%) | 25%, 39%
  Grade I | 3 (2.0%) | 0.0%, 4.2%
  Grade II | 52 (34.7%) | 27%, 42%
  Grade III | 47 (31.3%) | 24%, 39%
Previous myocardial infarction | 15 (10.0%) | 5.2%, 15%
Chronic heart failure | 42 (28.0%) | 21%, 35%
Type 2 diabetes mellitus | 37 (24.7%) | 18%, 32%
Anemia | 28 (18.7%) | 12%, 25%
Dyslipidemia | 124 (82.7%) | 77%, 89%
Metabolic syndrome | 36 (24.0%) | 17%, 31%
Chronic kidney disease | 21 (14.0%) | 8.4%, 20%
Chronic obstructive pulmonary disease | 8 (5.3%) | 1.7%, 8.9%
Thyroid disease | 7 (4.7%) | 1.3%, 8.0%
Malignant neoplastic disease | 3 (2.0%) | 0.0%, 4.2%
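The 95% confidence intervals in Table 1 appear consistent with a normal-approximation (Wald) interval for a proportion, truncated at zero. The source does not state the method used, so this is an assumption; the sketch below only shows how such intervals could be reproduced from the reported counts. The function name `wald_ci` is introduced here for illustration.

```python
import math


def wald_ci(events, n, z=1.96):
    """Normal-approximation (Wald) interval for a proportion, clipped to [0, 1]."""
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Counts taken from Table 1 (n = 150).
stroke_low, stroke_high = wald_ci(15, 150)   # previous stroke
grade1_low, grade1_high = wald_ci(3, 150)    # hypertension grade I

print(f"Previous stroke: {stroke_low:.1%} to {stroke_high:.1%}")
print(f"Hypertension grade I: {grade1_low:.1%} to {grade1_high:.1%}")
```

For small counts such as 3 of 150, the Wald lower bound falls below zero before clipping, which is one reason score-based intervals (for example, Wilson) are often preferred for rare characteristics.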
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.