Submitted: 20 November 2025
Posted: 21 November 2025
Abstract
Keywords:
1. Introduction
2. Literature Review
3. Materials and Methods
3.1. Traditional Credit Scoring Module
3.2. Internal Sales Analytics Module
3.3. Ensemble Integration
4. Experiments and Results
4.1. Data Sample and Preparation
4.2. Experimental Setup
- Data Collection and Data Pre-processing. These steps accounted for 80% of the overall methodology.
- Training of the Traditional Model. Logistic regression was applied to the bank’s credit data to establish baseline risk scores, serving as a foundation for comparison.
- Development of the Sales Analytics Module. Random forests and gradient boosting were used to extract predictive features from the sales data, improving the model’s robustness.
- Ensemble Integration. The outputs from the traditional credit scoring model and the sales analytics module were integrated using a weighted ensemble approach, allowing for a more comprehensive risk assessment.
- Evaluation. The individual models and the integrated hybrid model were evaluated on the testing set using a set of performance metrics (AUC, accuracy, mean squared error, sensitivity, and specificity).
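The ensemble integration step above can be illustrated with a minimal sketch. It assumes that both modules already output a default probability per borrower and that the ensemble weight is chosen on a validation split by minimizing log-loss; the paper does not state how the weight was selected, so the function names (`weighted_ensemble`, `select_weight`), the selection criterion, and the toy numbers are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def weighted_ensemble(p_credit: Sequence[float],
                      p_sales: Sequence[float],
                      w: float) -> List[float]:
    """Blend two probability vectors as w * credit + (1 - w) * sales."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    if len(p_credit) != len(p_sales):
        raise ValueError("score vectors must have equal length")
    return [w * c + (1.0 - w) * s for c, s in zip(p_credit, p_sales)]


def log_loss(y_true: Sequence[int], probs: Sequence[float],
             eps: float = 1e-12) -> float:
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def select_weight(y_val: Sequence[int],
                  p_credit: Sequence[float],
                  p_sales: Sequence[float],
                  steps: int = 10) -> Tuple[float, float]:
    """Grid-search w in {0, 1/steps, ..., 1} for the lowest validation log-loss.

    Assumption: log-loss is used as the selection criterion for illustration only.
    """
    best_w, best_loss = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        loss = log_loss(y_val, weighted_ensemble(p_credit, p_sales, w))
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss


# Toy validation data, invented for illustration (1 = default, 0 = repaid).
y_val = [1, 0, 1, 0, 1, 0]
p_credit = [0.70, 0.40, 0.35, 0.30, 0.60, 0.65]  # hypothetical traditional-model scores
p_sales = [0.55, 0.20, 0.80, 0.45, 0.40, 0.30]   # hypothetical sales-module scores

best_w, best_loss = select_weight(y_val, p_credit, p_sales)
hybrid_scores = weighted_ensemble(p_credit, p_sales, best_w)
```

Because the grid includes w = 0 and w = 1, the selected blend is never worse on the validation split than either module alone; whether that advantage carries over to the testing set still has to be checked separately.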
4.3. Performance Metrics
4.4. Results
5. Discussion
5.1. Integration Benefits
5.2. Broader Implications
5.3. Digitalization and Innovation Impacts
6. Conclusion
Acknowledgments
References
- Siddiqi, N. Intelligent Credit Scoring: Building and Implementing Better Credit Risk Scorecards; Wiley: Hoboken, NJ, USA, 2017.
- Hand, D. J.; Henley, W. E. Statistical classification methods in consumer credit scoring: A review. J. R. Stat. Soc. A 1997, 160, 523–541.
- Bastos, J. Forecasting bank loans loss-given-default. J. Bank. Finance 2009, 34, 2510–2517.
- Lessmann, S.; Baesens, B.; Seow, H.-V.; Thomas, L. Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research. Eur. J. Oper. Res. 2015.
- Florez, R.; Ramon, J. Enhancing accuracy and interpretability of ensemble strategies in credit risk assessment: A correlated-adjusted decision forest proposal. Expert Syst. Appl. 2015, 42.
- Rogojan, L.; Croicu, A.; Iancu, L. Modern approaches in credit risk modeling: A literature review. Proc. Int. Conf. Bus. Excell. 2023, 17, 1617–1627.
- Ali, M.; Razaque, A.; Yoo, J.; Kabievna, U.; Moldagulova, A.; Satybaldiyeva, R.; Zhuldyz, K.; Kassymova, A. Designing an intelligent scoring system for crediting manufacturers and importers of goods in Industry 4.0. Logistics 2024, 8.
- Khandani, A.; Kim, A.; Lo, A. Consumer credit-risk models via machine-learning algorithms. J. Bank. Finance 2010, 34, 2767–2787.
- Bücker, M.; Szepannek, G.; Gosiewska, A.; Biecek, P. Transparency, auditability and explainability of machine learning models in credit scoring. J. Oper. Res. Soc. 2020, 73, 70–90; arXiv:2009.13384.
- Soloshenko, O. M. K-plus-nearest neighbor method development for credit scoring machine learning tasks. East.-Eur. J. Enterp. Technol. 2015, 3, 29–38.
- Nwaimo, C. S.; Adegbola, A. E.; Adegbola, M. D. Predictive analytics for financial inclusion: Using machine learning to improve credit access for underbanked populations. Comput. Sci. IT Res. J. 2024, 5, 1358–1373.
- Chang, A.; Yang, L.-K.; Tsaih, R.-H.; Lin, S.-K. Machine learning and artificial neural networks to construct P2P lending credit-scoring model: A case using Lending Club data. Quant. Finance Econ. 2022, 6, 303–325.
- Zhao, J.; Li, B. Credit risk assessment of SMEs in supply chain finance based on SVM and BP neural network. Neural Comput. Appl. 2022, 34, 12467–12478.
- Tyumambayeva, A.; Abdeshova, A. B. Current state of bank lending to small and medium-sized businesses in Kazakhstan. Bull. Kazakh Univ. Econ. Finance Int. Trade 2022, 4.
- Roy, P. K.; Shaw, K. A multicriteria credit scoring model for SMEs using hybrid BWM and TOPSIS. Financ. Innov. 2021, 7.
- Zhang, L.; Wei, M.; Yi, Z. Credit decision system for MSMEs based on neural network and nonlinear programming. In Proceedings of the International Conference on Signal Processing (ICSP); 2022; pp. 1500–1506.
- Cahyani, D.; Hazmi, Y.; Khairun Zuhra, N.; Wildani, R. R. Evaluation of the implementation of the credit sales accounting information system. West Sci. Soc. Humanit. Stud. 2024, 2, 109–1098.
- Kedi, W. E.; Ejimuda, C.; Idemudia, C.; Ijomah, T. I. AI software for personalized marketing automation in SMEs: Enhancing customer experience and sales. World J. Adv. Res. Rev. 2024.
- Byun, W. J.; Choi, B.; Kim, S.; Jo, J. Practical application of deep reinforcement learning to optimal trade execution. FinTech 2023, 2, 414–429.
- Opoku, E.; Aribigbola, M. Enhancing small and medium-sized businesses through digitalization. World J. Adv. Res. Rev. 2024, 23, 239–249.


| Authors (year) | Methods/Models | Object of study | Key contributions/Results | Limitations/Research gaps |
|---|---|---|---|---|
| D. J. Hand and W. E. Henley (1997) [2] | Statistical classification methods (discriminant analysis, logistic regression) | Consumer credit scoring | A classic review summarizing early statistical approaches | No machine learning is used; static models are data-dependent |
| J. Bastos (2009) [3] | Regression-based loss given default (LGD) prediction | Bank loan portfolios | Quantitative LGD model | Narrow focus (default loss only); no behavioral or adaptive modeling |
| A. Khandani, A. Kim, and A. Lo (2010) [8] | Machine learning (SVM, RF, boosting) | Consumer lending risk | Innovative application of machine learning to predict credit risk | Does not address transparency or regulatory compliance |
| O. M. Soloshenko (2015) [10] | K-plus nearest neighbor (K+NN) | Consumer loans | Improved adaptation of KNN to evaluation tasks | High computational complexity; lack of interpretability |
| S. Lessmann et al. (2015) [4] | Comparative analysis of machine learning algorithms (support vector machines, random forests, neural networks) | Credit scoring datasets | Demonstrated superiority of ML methods over classical statistical models | High accuracy but limited interpretability and adaptability |
| R. Florez and J. Ramon (2015) [5] | Ensemble learning (correlated-adjusted decision forest) | Credit risk assessment | Improved balance between accuracy and interpretability | Still static; no self-learning or real-time adjustments |
| N. Siddiqi (2017) [1] | Development of traditional statistical indicator systems; logistic regression; expert systems | Consumer and retail credit | A comprehensive methodology for constructing interpretable scorecards used in banking practice | Limited adaptability and automation; lack of machine learning and dynamic learning |
| M. Bücker et al. (2020) [9] | Explainable machine learning (XAI, SHAP/LIME) | Credit scoring models | Focus on transparency, verifiability and explainability | Trade-off between accuracy and interpretability; lack of adaptability |
| W. Byun et al. (2023) [19] | Deep reinforcement learning (PPO-LSTM) | Optimal execution of large stock orders over varying time horizons in realistic market conditions | Practical application of deep reinforcement learning to optimal trade execution | Lack of stress-scenario evaluation |

| Variable | Credit Data | Sales Data |
|---|---|---|
| Mean Credit Score | 650 ± 45 | - |
| Mean Annual Revenue | 210 ± 45 | 218 ± 49 |
| Average Daily Sales Volume | - | 150 ± 20 |
| Seasonality Index | - | 1.5 ± 0.3 |
| Collateral Value | 126 ± 21 | - |

| Metric | Traditional Model | Hybrid Model |
|---|---|---|
| AUC | 0.76 | 0.87 |
| Accuracy | 0.78 | 0.85 |
| Mean Squared Error (MSE) | 0.12 | 0.08 |
| Sensitivity | 0.75 | 0.83 |
| Specificity | 0.80 | 0.87 |
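
For reference, the five metrics reported in the table above can be computed from predicted default probabilities and true labels roughly as follows. This is a sketch under stated assumptions: a classification threshold of 0.5, MSE taken as the mean squared difference between predicted probability and the 0/1 label, and toy data invented for illustration. The function names `auc` and `evaluate` are hypothetical, and the paper's own evaluation code may differ.

```python
from typing import Dict, Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], probs: Sequence[float],
             threshold: float = 0.5) -> Dict[str, float]:
    """Compute AUC, accuracy, MSE, sensitivity and specificity.

    Assumption: a score >= threshold is classified as a default (label 1).
    """
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    n = len(y_true)
    return {
        "auc": auc(y_true, probs),
        "accuracy": (tp + tn) / n,
        "mse": sum((p - y) ** 2 for y, p in zip(y_true, probs)) / n,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Toy labels and scores, invented for illustration (1 = default).
y_test = [1, 1, 1, 0, 0, 0]
p_test = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1]
metrics = evaluate(y_test, p_test)
```

AUC is independent of the threshold, whereas accuracy, sensitivity, and specificity all shift when the threshold moves, so comparisons between models are only meaningful at a common threshold.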
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).