Submitted:
21 May 2026
Posted:
22 May 2026
Abstract
Keywords:
1. Introduction
- A novel hybrid DeepSurv–LSTM architecture is developed for dynamic time-to-default credit risk modelling.
- Bayesian optimization is integrated to improve hyperparameter tuning and model generalization.
- The proposed framework combines survival analysis with temporal sequence learning to capture nonlinear and evolving borrower risk patterns.
- A comprehensive experimental evaluation is conducted using survival-specific and classification-based performance metrics.
- The study provides practical insights into the application of survival-aware deep learning models for financial risk management and credit default forecasting.
2. Literature Review
2.1. Traditional Credit Risk Modelling
2.2. Machine Learning Approaches in Credit Risk Prediction
2.3. Deep Learning Models for Financial Risk Prediction
2.4. Survival Analysis and Deep Survival Learning
2.5. Hybrid Deep Learning Frameworks and Research Gap
3. Methods and Materials
3.1. Overview of the Proposed Framework
3.2. Dataset Description
| Variable Category | Examples of Variables |
| Demographic variables | Age, marital status, employment |
| Financial variables | Income, debt ratio, liabilities |
| Behavioural variables | Repayment history, delinquency |
| Loan variables | Loan amount, duration, interest rate |
3.3. Survival-Time Construction
3.4. Feature Engineering and Temporal Sequence Construction
3.5. DeepSurv Survival Modelling
3.6. LSTM Temporal Learning Module
3.7. Hybrid DeepSurv–LSTM Architecture
3.8. Bayesian Hyperparameter Optimization
3.9. Experimental Design and Data Splitting
3.10. Model Evaluation Metrics
4. Results
4.1. Exploratory Data Analysis and Hyperparameter Tuning
4.2. Comparison and Interpretation of Model Performance
5. Discussion
5.1. Results Discussion
5.2. Practical and Policy Implications
6. Conclusions
7. Patents
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| AUC | Area Under the Curve |
| DeepSurv | Deep Survival Model |
| EDA | Exploratory Data Analysis |
| FN | False Negative |
| FP | False Positive |
| IBS | Integrated Brier Score |
| LSTM | Long Short-Term Memory |
| ROC-AUC | Receiver Operating Characteristic–Area Under the Curve |
| SMOTE | Synthetic Minority Oversampling Technique |
| TN | True Negative |
| TP | True Positive |
| XGBoost | Extreme Gradient Boosting |

| Model | Hyperparameter | Search Space |
| Cox Proportional Hazards | L2 Regularization (α) | [1e−5, 1e−1] |
| | Elastic Net Mixing Ratio | [0.0, 1.0] |
| XGBoost | Number of Estimators | [100, 800] |
| | Learning Rate (η) | [0.01, 0.30] |
| | Maximum Depth | [3, 10] |
| | Subsample Ratio | [0.60, 1.00] |
| | Column Sample by Tree | [0.50, 1.00] |
| | Gamma | [0, 1] |
| | Minimum Child Weight | [1, 10] |
| LSTM | Learning Rate | [1e−4, 1e−2] |
| | Number of LSTM Units | [32, 256] |
| | Number of LSTM Layers | [1, 3] |
| | Dropout Rate | [0.10, 0.50] |
| | Sequence Length (Timesteps) | [3, 12] |
| | Batch Size | {32, 64, 128} |
| | Optimizer | {Adam, RMSprop} |
| | Activation Function | {Tanh, ReLU} |
| | Epochs | [30, 300] |
| DeepSurv | Learning Rate | [1e−5, 1e−3] |
| | Number of Hidden Layers | [1, 4] |
| | Neurons per Layer | [32, 256] |
| | Dropout Rate | [0.10, 0.40] |
| | L2 Regularisation (λ) | [1e−6, 1e−3] |
| | Batch Size | {32, 64, 128} |
| Hybrid DeepSurv–LSTM | DeepSurv Hidden Units | [32, 128] |
| | LSTM Units | [64, 256] |
| | Joint Learning Rate | [1e−5, 1e−3] |
| | Dropout Rate | [0.10, 0.40] |
| | Fusion Weight | [0.30, 0.80] |
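
The bounds listed above can be written down as a small configuration object. The following is a minimal sketch for the Hybrid DeepSurv–LSTM rows only. It assumes uniform sampling for bounded reals, log-uniform sampling for the learning rate, and inclusive integer ranges for unit counts; the table does not state any of these sampling choices. The names `HYBRID_SPACE`, `sample_config` and `in_space` are hypothetical, and the random draw is only a stand-in for the proposal step of a Bayesian optimizer, not an implementation of one.

```python
import math
import random

# Search bounds copied from the Hybrid DeepSurv-LSTM rows of the table.
# The "kind" labels (int / log / float) are assumptions made for this sketch.
HYBRID_SPACE = {
    "deepsurv_hidden_units": ("int", 32, 128),
    "lstm_units": ("int", 64, 256),
    "joint_learning_rate": ("log", 1e-5, 1e-3),
    "dropout_rate": ("float", 0.10, 0.40),
    "fusion_weight": ("float", 0.30, 0.80),
}


def sample_config(space, rng):
    """Draw one candidate configuration at random from the search space."""
    config = {}
    for name, (kind, low, high) in space.items():
        if kind == "int":
            config[name] = rng.randint(low, high)  # inclusive on both ends
        elif kind == "log":
            # Log-uniform draw: uniform in log space, then exponentiate.
            config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        else:
            config[name] = rng.uniform(low, high)
    return config


def in_space(config, space, rel_tol=1e-9):
    """Return True if every value lies within its bounds (small float slack)."""
    for name, (kind, low, high) in space.items():
        value = config[name]
        if kind == "int" and not isinstance(value, int):
            return False
        if not (low * (1 - rel_tol) <= value <= high * (1 + rel_tol)):
            return False
    return True


if __name__ == "__main__":
    rng = random.Random(42)
    candidate = sample_config(HYBRID_SPACE, rng)
    print(candidate, in_space(candidate, HYBRID_SPACE))
```

A helper such as `in_space` is also a cheap sanity check that a tuned configuration does not fall outside the declared bounds.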
| Model | Hyperparameter | Optimal Value |
| Cox Proportional Hazards | L2 Regularisation (λ) | 0.0125 |
| | Tolerance | 1.0 × 10⁻⁴ |
| | Maximum Iterations | 250 |
| XGBoost | Number of Estimators | 600 |
| | Learning Rate (η) | 0.08 |
| | Maximum Depth | 6 |
| | Subsample Ratio | 0.85 |
| | Column Sample by Tree | 0.75 |
| | Gamma | 0.8 |
| | Minimum Child Weight | 4 |
| LSTM | Learning Rate | 1.0 × 10⁻³ |
| | LSTM Units | 128 |
| | LSTM Layers | 2 |
| | Dropout Rate | 0.30 |
| | Sequence Length | 6 |
| | Batch Size | 64 |
| | Optimizer | Adam |
| | Epochs | 100 |
| | Activation Function | ReLU |
| DeepSurv | Learning Rate | 3.5 × 10⁻⁴ |
| | Hidden Layers | 3 |
| | Neurons per Layer | 128 |
| | Dropout Rate | 0.25 |
| | L2 Regularisation | 5.0 × 10⁻⁵ |
| | Batch Size | 64 |
| | Activation Function | ReLU |
| Hybrid DeepSurv–LSTM | DeepSurv Hidden Units | 64 |
| | LSTM Units | 128 |
| | Joint Learning Rate | 5.0 × 10⁻⁴ |
| | Dropout Rate | 0.25 |
| | Fusion Weight | 0.60 |

| Metric | XGBoost | DeepSurv | LSTM | DeepSurv–LSTM |
| Non-Bayesian-Optimised Models | | | | |
| C-Index | 0.7413 | 0.7682 | 0.7844 | 0.8121 |
| IBS | 0.1925 | 0.1816 | 0.1738 | 0.1562 |
| Accuracy | 0.8814 | 0.8623 | 0.8736 | 0.9027 |
| Precision | 0.8542 | 0.8325 | 0.8417 | 0.8794 |
| Recall | 0.8318 | 0.8126 | 0.8245 | 0.8612 |
| F1-Score | 0.8429 | 0.8224 | 0.8330 | 0.8702 |
| ROC-AUC | 0.9086 | 0.8921 | 0.9017 | 0.9415 |
| Bayesian-Optimised Models | | | | |
| C-Index | 0.7816 | 0.8096 | 0.8279 | 0.8617 |
| IBS | 0.1712 | 0.1593 | 0.1486 | 0.1293 |
| Accuracy | 0.9142 | 0.8931 | 0.9025 | 0.9483 |
| Precision | 0.8875 | 0.8654 | 0.8762 | 0.9264 |
| Recall | 0.8721 | 0.8513 | 0.8618 | 0.9132 |
| F1-Score | 0.8797 | 0.8583 | 0.8689 | 0.9197 |
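
Two of the metrics in the table can be illustrated directly. The sketch below is a hedged illustration, not the evaluation code behind these results. It computes F1 as the harmonic mean of precision and recall, and Harrell's concordance index with a plain pairwise loop. The function names and the four-borrower toy data are hypothetical, and tied event times are simply treated as non-comparable for brevity.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def concordance_index(times, events, risks):
    """Harrell's C-index via an O(n^2) pairwise comparison.

    A pair (i, j) is comparable when subject i has an observed event and
    its time is strictly earlier than subject j's time. The pair is
    concordant when the model assigns i the higher risk; ties in risk
    count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Precision and recall taken from the non-optimised DeepSurv-LSTM column.
    print(round(f1_score(0.8794, 0.8612), 4))

    # Hypothetical toy data: months to default or censoring, event flag, risk score.
    times = [2, 4, 6, 8]
    events = [1, 1, 0, 1]
    risks = [0.9, 0.7, 0.4, 0.1]
    print(concordance_index(times, events, risks))
```

Recomputing F1 from the precision and recall rows reproduces the reported F1 rows to within rounding of the fourth decimal, which is a quick internal consistency check on the table.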
| Comparison | Hybrid Mean AUC | Model Mean AUC | t-statistic | p-value | Significant |
| Hybrid vs XGBoost | 0.975 | 0.942 | 4.82 | 0.008 | Yes |
| Hybrid vs LSTM | 0.975 | 0.933 | 5.11 | 0.006 | Yes |
| Hybrid vs DeepSurv | 0.975 | 0.921 | 5.76 | 0.004 | Yes |
| Hybrid vs DeepSurv–LSTM | 0.975 | 0.973 | 5.53 | 0.001 | Yes |
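
A comparison of this kind can be carried out with a paired t-test on per-fold scores. The sketch below is a minimal illustration that assumes a paired design over cross-validation folds, which the table does not state. The per-fold AUC values are made up for the example and are not results from the study. The helper names are hypothetical, and the p-value is obtained by numerically integrating the Student t density with Simpson's rule so that only the standard library is needed.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees of freedom) for paired samples of equal length."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    spread = stdev(diffs)
    if spread == 0:
        raise ValueError("differences have zero variance")
    return mean(diffs) / (spread / math.sqrt(n)), n - 1


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the probability mass within [-|t|, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3  # integral of the density over [0, |t|]
    return max(0.0, 1.0 - 2.0 * half_area)


if __name__ == "__main__":
    # Hypothetical per-fold AUCs for two models evaluated on the same folds.
    hybrid = [0.970, 0.980, 0.975, 0.972, 0.978]
    baseline = [0.940, 0.945, 0.940, 0.943, 0.942]
    t, df = paired_t_statistic(hybrid, baseline)
    print(t, df, two_sided_p(t, df))
```

With access to SciPy, `scipy.stats.ttest_rel` performs the same test in one call; the hand-rolled version is shown only to make the arithmetic explicit.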
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).