Preprint. Article. Version 1. Preserved in Portico. This version is not peer-reviewed.

Machine Learning for Credit Risk Prediction: A Systematic Literature Review

Version 1: Received: 10 August 2023 / Approved: 11 August 2023 / Online: 11 August 2023 (13:26:43 CEST)

A peer-reviewed article of this Preprint also exists.

Noriega, J.P.; Rivera, L.A.; Herrera, J.A. Machine Learning for Credit Risk Prediction: A Systematic Literature Review. Data 2023, 8, 169.

Abstract

This systematic literature review on Machine Learning (ML) for credit risk prediction highlights the need for financial institutions to use artificial intelligence (AI) and ML to assess credit risk by analyzing large volumes of information. We posed research questions about the algorithms, metrics, results, data sets, variables, and related limitations involved in predicting credit risk. To answer them, we searched renowned databases and identified 52 relevant studies within the microfinance credit industry. We identified challenges and approaches in credit risk prediction using ML models, including difficulties with the implemented models such as their black-box nature, the need for explainable artificial intelligence, the importance of selecting relevant features, the handling of multicollinearity, and the problem of imbalance in the input data. In answering the research questions, we found that the Boosted Category is the most researched family of ML models; that the most commonly used evaluation metrics are Area Under Curve (AUC), Accuracy (ACC), Recall, F1 score (F1), and Precision; and that research mainly uses public data sets to compare models and private data sets to generate new knowledge when applied to the real world. The most significant limitation identified is how well the data represent reality, and the variables primarily used in the microcredit industry relate to demographics, operations, and payment behavior. This study aims to guide developers of credit risk management tools and software toward the available ML methods, metrics, and techniques for forecasting credit risk, thereby helping to minimize possible losses due to default and to guide risk appetite.

Keywords

loan; credit risk; prediction; machine learning; systematic literature review

Subject

Computer Science and Mathematics, Computer Science
