Submitted: 19 January 2025
Posted: 21 January 2025
Abstract
The rising incidence of credit card fraud underscores the need for innovative strategies to enhance credit card fraud detection and prevention. Numerous approaches have been employed for credit card fraud detection; however, the field continues to seek methods that can adapt to the constantly evolving nature of fraud patterns. In this study, we develop a hybrid model by integrating machine learning algorithms for effective credit card fraud detection. Using a simulated credit card transaction dataset, the model is developed in two stages. The first stage identifies a base algorithm for the proposed hybrid model. The second stage focuses on developing the hybrid model by combining the base model (Light Gradient Boosting Machine, LGBM) with each of the other selected algorithms. The hybrid models demonstrated superior performance compared to the standalone algorithms. In particular, the hybrid of LGBM and XGBoost outperforms the other combinations, achieving 98.30% accuracy, 98.88% precision, 98.05% recall, 98.46% F1-score, and 99.80% AUROC. The proposed hybrid model can enhance security and foster trust in financial institutions and businesses, and in turn contribute to a more stable and efficient financial ecosystem.
Keywords:
1. Introduction
2. Related Works
- The ever-evolving nature of fraud patterns.
- Interpretability of models, which is of great concern, especially in industries where regulatory compliance and transparency are paramount.
- Imbalanced datasets.
- Developing scalable models that ensure timely detection without compromising accuracy.
- Although hybrid models show promise in enhancing accuracy, there is a lack of consensus on the optimal combination of algorithms for specific contexts.
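To make the imbalanced-dataset challenge listed above concrete, the following minimal sketch shows how a classifier that never flags fraud can still report high accuracy. The class ratio (10 fraudulent transactions out of 1,000) and the variable names are hypothetical and are not taken from the dataset used in this study.

```python
# Hypothetical labels: 1 = fraud, 0 = legitimate (10 frauds in 1,000 transactions).
y_true = [1] * 10 + [0] * 990

# A trivial classifier that labels every transaction as legitimate.
y_pred = [0] * len(y_true)

# Accuracy: share of transactions labelled correctly.
correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
accuracy = correct / len(y_true)

# Recall: share of actual frauds that were detected.
true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
recall = true_positives / sum(y_true)

print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
```

Accuracy alone hides the fact that no fraud is detected here, which is one reason metrics such as precision, recall, F1-score, and AUROC are reported alongside it.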
3. Methodology
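The two-stage procedure summarized in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the base algorithm is picked here by standalone accuracy (the selection criterion is an assumption), and the combination rule is a simple weighted average of predicted fraud probabilities (soft voting), which the paper does not confirm. The accuracy values are copied from the standalone-results table; the function names `select_base` and `soft_vote` and the probability lists are hypothetical.

```python
# Standalone accuracy per algorithm (copied from the standalone-results table).
standalone_accuracy = {
    "XGBoost": 0.78,
    "LGBM": 0.84,
    "Decision Trees": 0.62,
    "Neural Networks": 0.71,
    "Random Forest": 0.79,
    "SVM": 0.76,
    "AdaBoost": 0.78,
}


def select_base(scores):
    """Stage 1: return the algorithm with the highest score.

    ASSUMPTION: accuracy is used as the selection criterion.
    """
    return max(scores, key=scores.get)


def soft_vote(base_probs, partner_probs, base_weight=0.5, threshold=0.5):
    """Stage 2: blend two lists of fraud probabilities and threshold them.

    ASSUMPTION: weighted averaging is an illustrative combination rule;
    the paper does not state which rule it uses.
    """
    if len(base_probs) != len(partner_probs):
        raise ValueError("probability lists must have the same length")
    blended = [
        base_weight * b + (1.0 - base_weight) * p
        for b, p in zip(base_probs, partner_probs)
    ]
    return [1 if score >= threshold else 0 for score in blended]


base = select_base(standalone_accuracy)
partners = [name for name in standalone_accuracy if name != base]

# Hypothetical fraud probabilities for four transactions (not real model output).
lgbm_probs = [0.90, 0.20, 0.55, 0.10]
xgb_probs = [0.80, 0.30, 0.35, 0.05]
labels = soft_vote(lgbm_probs, xgb_probs)

print(base, partners)
print(labels)
```

In practice the probability lists would come from fitted models, and other combination schemes (for example stacking or majority voting) would slot into the same second stage.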
4. Experimentation
4.1. Data Exploration
4.2. Data Preparation and Preprocessing
4.3. System Setup and Experiments
4.4. Result Presentation and Discussion
5. Conclusions
| Models | Accuracy | Precision | Recall | F1 Score | AUROC |
|---|---|---|---|---|---|
| XGBoost | 0.78 | 0.73 | 0.78 | 0.75 | 0.89 |
| LGBM | 0.84 | 0.76 | 0.81 | 0.78 | 0.82 |
| Decision Trees | 0.62 | 0.61 | 0.55 | 0.58 | 0.63 |
| Neural Networks | 0.71 | 0.68 | 0.72 | 0.70 | 0.60 |
| Random Forest | 0.79 | 0.80 | 0.75 | 0.77 | 0.81 |
| SVM | 0.76 | 0.74 | 0.63 | 0.68 | 0.74 |
| AdaBoost | 0.78 | 0.65 | 0.71 | 0.67 | 0.72 |

| Models | Accuracy | Precision | Recall | F1 Score | AUROC | Type-1 Error | Type-2 Error |
|---|---|---|---|---|---|---|---|
| LGBM + XGBoost | 0.9830 | 0.9888 | 0.9805 | 0.9846 | 0.9980 | 0.0140 | 0.0195 |
| LGBM + Decision Trees | 0.9773 | 0.9792 | 0.9800 | 0.9796 | 0.9770 | 0.0261 | 0.0200 |
| LGBM + Neural Networks | 0.8847 | 0.9511 | 0.8361 | 0.8899 | 0.9472 | 0.0541 | 0.1639 |
| LGBM + Random Forest | 0.9750 | 0.9831 | 0.9719 | 0.9775 | 0.9967 | 0.0210 | 0.0281 |
| LGBM + SVM | 0.8693 | 0.9322 | 0.8255 | 0.8756 | 0.9363 | 0.0755 | 0.1745 |
| LGBM + AdaBoost | 0.9214 | 0.9615 | 0.8948 | 0.9270 | 0.9689 | 0.0451 | 0.1052 |
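As a reading aid for the metrics tabulated above, the sketch below computes them from confusion-matrix counts and checks two relations that hold among the reported values: F1 is the harmonic mean of precision and recall, and the Type-2 error equals 1 minus recall. Interpreting the Type-1 error as the false positive rate and the Type-2 error as the false negative rate is an assumption, as are the function names and the example counts; the row values are copied from the hybrid-model table.

```python
def metrics_from_counts(tp, fp, tn, fn):
    """Compute the tabulated metrics from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        # ASSUMPTION: Type-1 error read as the false positive rate.
        "type1_error": fp / (fp + tn),
        # ASSUMPTION: Type-2 error read as the false negative rate.
        "type2_error": fn / (fn + tp),
    }


# (precision, recall, f1, type2_error) copied from the hybrid-model table.
hybrid_rows = {
    "LGBM + XGBoost": (0.9888, 0.9805, 0.9846, 0.0195),
    "LGBM + Decision Trees": (0.9792, 0.9800, 0.9796, 0.0200),
    "LGBM + Neural Networks": (0.9511, 0.8361, 0.8899, 0.1639),
    "LGBM + Random Forest": (0.9831, 0.9719, 0.9775, 0.0281),
    "LGBM + SVM": (0.9322, 0.8255, 0.8756, 0.1745),
    "LGBM + AdaBoost": (0.9615, 0.8948, 0.9270, 0.1052),
}


def row_is_consistent(precision, recall, f1, type2_error, tol=5e-4):
    """Check F1 and Type-2 error against precision and recall (rounding tolerance)."""
    f1_expected = 2 * precision * recall / (precision + recall)
    return (
        abs(f1_expected - f1) <= tol
        and abs((1.0 - recall) - type2_error) <= tol
    )


# Hypothetical counts, used only to exercise the function.
example = metrics_from_counts(tp=8, fp=2, tn=88, fn=2)
all_consistent = all(row_is_consistent(*row) for row in hybrid_rows.values())

print(example)
print(all_consistent)
```

Because the Type-2 error is the complement of recall, the two columns carry the same information; the Type-1 error adds the share of legitimate transactions that are wrongly flagged.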
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).