Submitted: 26 May 2026
Posted: 27 May 2026
Abstract
Keywords:
1. Introduction
1.1. Overview
1.2. Literature Review
1.2.1. Traditional Early Warning Systems for Banking Crises
1.2.2. Machine Learning Methods for Banking Crisis Prediction
1.3. Contributions and Research Highlights
- Three machine learning models (Random Forest, XGBoost, and SVM) were assessed and compared with logistic regression for forecasting financial crises in Nigeria.
- Random Forest performed best in cross-validation on the training data (F1-score 0.9270, recall 0.9500, ROC-AUC 0.9953), followed by XGBoost.
- XGBoost proved the most robust on unseen data, achieving the best overall test-set performance with the highest accuracy (0.92) and crisis-class F1-score (0.80).
- Exchange rate volatility, inflation, systemic crises, and sovereign debt defaults emerged as important macro-financial indicators, underscoring the role of economic fundamentals in the onset of crises.
- Estimated crisis probabilities were used to build a workable Early Warning System (EWS) that placed every past crisis in the Red alert zone, demonstrating the framework's policy applicability.
- The study offers policymakers guidance on model selection and risk management, along with empirical evidence supporting the use of machine learning models for crisis forecasting in Nigeria.
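
The mapping from estimated crisis probabilities to alert zones mentioned in the highlights can be sketched as follows. This is a minimal illustration, not the authors' implementation: the source names a Red alert zone but does not state the cut-offs, so the thresholds `AMBER_THRESHOLD` and `RED_THRESHOLD`, the zone names other than Red, and the sample probabilities are all assumptions.

```python
# Hypothetical cut-offs: the source does not state the thresholds it used.
AMBER_THRESHOLD = 0.30
RED_THRESHOLD = 0.60


def alert_zone(prob: float) -> str:
    """Map an estimated crisis probability to a traffic-light alert zone."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if prob >= RED_THRESHOLD:
        return "Red"
    if prob >= AMBER_THRESHOLD:
        return "Amber"
    return "Green"


def red_zone_hit_rate(probs, labels) -> float:
    """Share of actual crisis years (label 1) that fall in the Red zone."""
    crisis_probs = [p for p, y in zip(probs, labels) if y == 1]
    if not crisis_probs:
        raise ValueError("no crisis observations")
    hits = sum(1 for p in crisis_probs if alert_zone(p) == "Red")
    return hits / len(crisis_probs)


# Made-up probabilities and crisis labels, for illustration only.
probs = [0.05, 0.72, 0.41, 0.91, 0.12]
labels = [0, 1, 0, 1, 0]

zones = [alert_zone(p) for p in probs]
hit_rate = red_zone_hit_rate(probs, labels)
print(zones, hit_rate)
```

A hit rate of 1.0 on such data corresponds to the situation described above, in which every past crisis year lands in the Red zone. The hit rate alone says nothing about false alarms, so in practice it would be read alongside the share of non-crisis years flagged Amber or Red.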
2. Methods
2.1. Data Description and Preprocessing
2.2. Feature Selection
2.3. Model Development
2.3.1. Logistic Regression
2.3.2. Random Forest
2.3.3. Support Vector Machines
2.3.4. Extreme Gradient Boosting
2.4. Model Evaluation and Interpretability
3. Results
3.1. Exploratory Data Analysis

3.2. Feature Importance
3.3. Model Performance
3.3.1. Cross-Validation
3.3.2. Test Set Evaluation
3.4. ROC and Precision–Recall Analysis
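
As a minimal illustration of the ROC-AUC metric reported for each model, the sketch below computes it as the share of (crisis, non-crisis) pairs in which the crisis observation receives the higher score. The function name `roc_auc` and the sample scores are assumptions introduced for illustration and are not taken from the study. The example also shows that a perfectly inverted ranking yields an AUC of 0.0, which is one possible reading of an AUC at that extreme and would usually prompt a check of the score orientation.

```python
def roc_auc(scores, labels):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = crisis year) and model scores, for illustration only.
labels = [0, 0, 1, 0, 1, 0]
scores = [0.10, 0.20, 0.90, 0.30, 0.80, 0.05]
mixed_scores = [0.10, 0.85, 0.90, 0.30, 0.80, 0.05]

auc_good = roc_auc(scores, labels)                    # every crisis year outranks every non-crisis year
auc_inverted = roc_auc([-s for s in scores], labels)  # same scores with the sign flipped
auc_mixed = roc_auc(mixed_scores, labels)             # one non-crisis year outranks one crisis year
print(auc_good, auc_inverted, auc_mixed)
```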
3.5. Early Warning System Outputs

3.6. Summary of Key Findings
1. Tree-based ensemble models, especially Random Forest and XGBoost, generally classify historical banking crisis episodes better than logistic regression and Support Vector Machines, which suggests that they can capture nonlinear macro-financial relationships.
2. Exchange rate volatility, inflation, systemic crisis variables, and sovereign external debt default events are significant predictors across a variety of model specifications, highlighting the role of macroeconomic instability in Nigerian banking sector crises.
3. Because the data are measured annually, the models are best suited to detecting crisis periods contemporaneously. The estimated crisis probability is high in the crisis year itself, which does not demonstrate forward-looking predictive ability.
4. Although the framework identifies past crisis episodes accurately, the results should be viewed as indicative of above-average performance rather than as clear evidence of a functioning early warning system.
5. Overall, the results suggest that machine learning approaches can be combined with traditional econometric approaches to provide effective early warning of banking crisis risk in Nigeria. However, their application in practice depends on the availability of more extensive data, a clearer definition of lead-time performance, and improved interpretability.
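
One way to make the lead-time question in the findings above concrete is to pair the indicators observed in year t with the crisis label of year t + h, so that a model is scored on what it signals ahead of a crisis rather than during it. The sketch below shows only this label shift. It is a possible approach under stated assumptions, not the procedure used in the study, and the helper name `lead_target`, the years, and the labels are hypothetical.

```python
def lead_target(years, crisis, horizon=1):
    """Pair each year t with the crisis indicator observed at t + horizon.

    Years whose future label is unavailable (the last `horizon` years,
    or years followed by a gap in the series) are dropped.
    """
    by_year = dict(zip(years, crisis))
    pairs = []
    for t in years:
        future = t + horizon
        if future in by_year:
            pairs.append((t, by_year[future]))
    return pairs


# Made-up annual series (1 = crisis year), for illustration only.
years = [2001, 2002, 2003, 2004, 2005, 2006]
crisis = [0, 0, 0, 1, 1, 0]

shifted = lead_target(years, crisis, horizon=1)
print(shifted)
```

With a one-year horizon, the year before the first crisis year carries the positive label, so high predicted probabilities in that year would count as an early signal. Each extra year of horizon removes one usable observation, which matters for a series of only 60 annual observations.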
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations

| Abbreviation | Definition |
|---|---|
| EWS | Early Warning System |
| ML | Machine Learning |
| SVM | Support Vector Machine |
| RF | Random Forest |
| XGBoost | Extreme Gradient Boosting |
| GDP | Gross Domestic Product |
| CPI | Consumer Price Index |
| ROC | Receiver Operating Characteristic |
| AUC | Area Under the Curve |
| SMOTE | Synthetic Minority Oversampling Technique |
| KNN | K-Nearest Neighbours |
| IMF | International Monetary Fund |
References
1. Reinhart, C.M.; Rogoff, K.S. This Time Is Different: Eight Centuries of Financial Folly; Princeton University Press: Princeton, NJ, USA, 2009.
2. Laeven, L.; Valencia, F. Systemic Banking Crises Revisited; International Monetary Fund: Washington, DC, USA, 2018.
3. Peychev, A. Reforming the European Stability Mechanism: Too Much but Never Enough; Wilfried Martens Centre for European Studies, 2021.
4. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016; pp. 785–794.
5. Claessens, S.; Kose, A.; Laeven, L.; Valencia, F. Understanding Financial Crises: Causes, Consequences, and Policy Responses; International Monetary Fund: Washington, DC, USA, 2014.
6. Reimann, C. Predicting financial crises: An evaluation of machine learning algorithms and model explainability for early warning systems. Rev. Evol. Political Econ. 2024, 5, 51–83.
7. Voskamp, J. Machine Learning for Financial Crisis Prediction; Ph.D. Thesis, 2024.
8. Reinhart, C.M.; Rogoff, K.S.; Trebesch, C.; Reinhart, V.R. Global Crises Data by Country; Harvard Business School: Boston, MA, USA, 2019.
9. Kaminsky, G.; Lizondo, S.; Reinhart, C.M. Leading indicators of currency crises. Staff Papers 1998, 45, 1–48.
10. Drehmann, M.; Juselius, M. Evaluating early warning indicators of banking crises: Satisfying policy requirements. Int. J. Forecast. 2014, 30, 759–780.
11. Betz, F.; Oprică, S.; Peltonen, T.A.; Sarlin, P. Predicting distress in European banks. J. Bank. Financ. 2014, 45, 225–241.
12. Langley, P. Elements of Machine Learning; Morgan Kaufmann: San Francisco, CA, USA, 1996.
13. Beutel, J.; List, S.; von Schweinitz, G. Does machine learning help us predict banking crises? J. Financ. Stab. 2019, 45, 100693.
14. El Halabi, L. Predicting Banking Crises Using Machine Learning: The Case of Lebanon; Ph.D. Thesis, 2023.
15. Laeven, L. Banking crises: A review. Annu. Rev. Financ. Econ. 2011, 3, 17–40.
16. Gogas, P.; Papadimitriou, T.; Agrapetidou, A. Forecasting bank failures and stress testing: A machine learning approach. Int. J. Forecast. 2018, 34, 440–455.
17. Bussmann, N.; Giudici, P.; Marinelli, D.; Papenbrock, J. Explainable machine learning in credit risk management. Comput. Econ. 2021, 57, 203–216.
18. Liu, L.; Chen, C.; Wang, B. Predicting financial crises with machine learning methods. J. Forecast. 2022, 41, 871–910.
19. Trickz, J. African Crises Dataset. Available online: https://www.kaggle.com/jaimetrickz/african-crises (accessed on 16 May 2025).
20. Yin, S. Machine learning algorithms for early warning systems: Predicting systemic financial crises through non-linear econometric models. Int. J. Econ. Financ. Stud. 2024, 16, 286–311.






Summary of related literature on banking crisis prediction.

| Reference | Methodology Type | Key Features | Performance Metrics | Limitations |
|---|---|---|---|---|
| [9] | Traditional EWS: Logit/Probit | Signal extraction approach | False alarm rates | Linear assumptions; high false positives |
| [1] | Historical analysis | Crisis incidence analysis | Frequency analysis | Descriptive only; not predictive |
| [2,15] | Systemic crisis database | Crisis dating | Systemic risk indicators | Retrospective; limited prediction |
| [10] | Credit-to-GDP gap | Single-indicator EWS | Signal extraction | Country-specific limitations |
| [11] | Multivariate logit | Stress testing | AUROC, Calibration | Data requirements for emerging markets |
| [4] | XGBoost | Ensemble learning | Accuracy | Black-box nature |
| [16] | SVM, Random Forests | Kernel methods | AUROC | Limited emerging economy focus |
| [13] | ML vs Traditional | Comparative analysis | AUROC | Data constraints |
| [14] | Ensemble methods | Random Forests | F1-score, Accuracy | Interpretability challenges |
| [6] | XGBoost, Random Forests | Ensemble learning | AUROC, F1-score | European focus only |
| [17] | XAI with SHAP | Explainable AI | SHAP values | Accuracy-interpretability trade-off |
| [3] | Hybrid EWS | Bayesian methods | Predictive accuracy | Implementation complexity |

Descriptive statistics of the dataset variables (60 annual observations).

| Variable | Count | Mean | Std | Min | 25% | 50% | 75% | Max |
|---|---|---|---|---|---|---|---|---|
| case | 60 | 45.0 | 0.0 | 45.0 | 45.0 | 45.0 | 45.0 | 45.0 |
| year | 60 | 1983.88 | 17.88 | 1954.0 | 1968.75 | 1983.5 | 1999.25 | 2014.0 |
| systemic_crisis | 60 | 0.167 | 0.376 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| exch_usd | 60 | 38.95 | 59.08 | 0.0 | 0.0 | 0.78 | 100.85 | 158.27 |
| domestic_debt_in_default | 60 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| sovereign_external_debt_default | 60 | 0.15 | 0.36 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| gdp_weighted_default | 60 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| inflation_annual_cpi | 60 | 14.77 | 15.65 | -4.55 | 6.10 | 10.76 | 16.30 | 72.73 |
| independence | 60 | 0.9 | 0.30 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| currency_crises | 60 | 0.167 | 0.376 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| inflation_crises | 60 | 0.2 | 0.403 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| banking_crisis | 60 | 0.183 | 0.390 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |

Cross-validation performance on the training data.

| Model | Accuracy | F1-score | Precision | Recall | ROC-AUC |
|---|---|---|---|---|---|
| Logistic Regression | 0.7808 | 0.7569 | 0.8731 | 0.7393 | 0.9638 |
| Random Forest | 0.9233 | 0.9270 | 0.9200 | 0.9500 | 0.9953 |
| SVM | 0.5742 | 0.4682 | 0.4981 | 0.5000 | 0.8821 |
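| XGBoost | 0.8717 | 0.8687 | 0.8950 | 0.8679 | 0.9828 |

Test set performance (class 1 = crisis, class 0 = no crisis).

| Model | Accuracy | F1 (class 0) | F1 (class 1) | Precision (class 1) | Recall (class 1) | ROC-AUC |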
|---|---|---|---|---|---|---|
| Logistic Regression | 0.83 | 0.89 | 0.67 | 0.50 | 1.00 | 1.0 |
| Random Forest | 0.75 | 0.82 | 0.57 | 0.40 | 1.00 | 1.0 |
| SVM | 0.83 | 0.89 | 0.67 | 0.50 | 1.00 | 0.0 |
| XGBoost | 0.92 | 0.95 | 0.80 | 0.67 | 1.00 | 1.0 |
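
The test-set figures above can be cross-checked with the standard definitions of accuracy, precision, recall, and F1. The sketch below is a worked check under an assumption: the confusion counts in `assumed_counts` are not reported in the source. They are inferred from the rounded values, which are consistent with a hold-out of 12 observations containing 2 crisis years, all of which every model flags (recall 1.00).

```python
def class_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, F1) for the positive (crisis) class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


# Hypothetical confusion counts (tp, fp, fn, tn), inferred from the rounded
# table values rather than taken from the source.
assumed_counts = {
    "Logistic Regression": (2, 2, 0, 8),
    "Random Forest": (2, 3, 0, 7),
    "SVM": (2, 2, 0, 8),
    "XGBoost": (2, 1, 0, 9),
}

results = {
    name: tuple(round(value, 2) for value in class_metrics(*counts))
    for name, counts in assumed_counts.items()
}
for name, metrics in results.items():
    print(name, metrics)
```

Under these assumed counts, the models differ only in the number of false alarms (1 for XGBoost, 2 for logistic regression and SVM, 3 for Random Forest), so a single additional misclassified year would move accuracy by about 0.08. This illustrates how sensitive the test-set ranking is to the small hold-out size.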
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).