Submitted: 14 December 2023
Posted: 15 December 2023
Abstract
Keywords:
1. Introduction

2. Machine Learning
3. Traditional ML Approaches to Fraud Detection
4. Deep Learning Approaches to Fraud Detection
5. Common Fraud Detection Datasets
6. Conclusion
Funding
Data Availability Statement
Acknowledgment
Conflict of Interest



| Method | Core Formula | Characteristics | Limitations |
|---|---|---|---|
| Bayesian Classification | (formula image missing) | Supports incremental training; assumes feature independence. | Cannot model feature combinations. |
| Logistic Regression | (formula image missing) | Easy to model; supports incremental training. | Sensitive to outliers. |
| Support Vector Machine | (formula image missing) | Highly robust; supports nonlinear mapping. | Difficult to train at large scale. |
| Decision Trees | (formula image missing) | Clear rules; can handle outliers. | Data-dependent; prone to overfitting. |
| k-Nearest Neighbor | (formula image missing) | Supports incremental training; parameter-independent; robust to missing values and noise. | Slow classification; high computational cost at large scale. |
| Ensemble Learning | (formula image missing) | Highly robust; high accuracy. | Increased computational cost; some algorithms parallelize poorly. |
| Neural Network | (formula image missing) | Learns autonomously; supports nonlinear fitting. | Limited ability to fit complex functions; poor interpretability. |
| Deep Learning | (formula image missing) | Strong learning ability; high accuracy. | High computational and annotation costs; some hyperparameters must be determined empirically. |
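
For reference, the methods in the table are commonly written in the textbook forms sketched below. These are standard formulations and not necessarily the expressions the authors placed in the Core Formula column. All symbols are introduced here for illustration: $\mathbf{x} = (x_1, \dots, x_n)$ is a feature vector, $y$ a class label, $c$ a candidate class, $\mathbf{w}$ and $b$ a weight vector and bias, $N_k(\mathbf{x})$ the $k$ nearest training points, $h_t$ a base learner with weight $\alpha_t$, and $\sigma$ an activation function.

```latex
% Requires amsmath. Standard textbook forms, assumed for illustration;
% they are not recovered from the source's missing formula images.
% Index i runs over features, j over training samples, t over base learners.
\begin{align*}
\text{Bayesian classification (naive Bayes):}\quad
  & \hat{y} = \arg\max_{c}\; P(c) \prod_{i=1}^{n} P(x_i \mid c) \\
\text{Logistic regression:}\quad
  & P(y = 1 \mid \mathbf{x}) = \frac{1}{1 + e^{-(\mathbf{w}^{\top}\mathbf{x} + b)}} \\
\text{Support vector machine (hard margin):}\quad
  & \min_{\mathbf{w},\, b}\; \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2}
    \quad \text{s.t.}\quad y_j(\mathbf{w}^{\top}\mathbf{x}_j + b) \ge 1 \;\; \forall j \\
\text{Decision tree (information gain):}\quad
  & \mathrm{Gain}(D, a) = H(D) - \sum_{v} \frac{\lvert D_v \rvert}{\lvert D \rvert}\, H(D_v),
    \qquad H(D) = -\sum_{c} p_c \log_2 p_c \\
\text{$k$-nearest neighbor:}\quad
  & \hat{y} = \arg\max_{c} \sum_{\mathbf{x}_j \in N_k(\mathbf{x})} \mathbf{1}[y_j = c] \\
\text{Ensemble learning (weighted vote, } y \in \{-1, +1\}\text{):}\quad
  & H(\mathbf{x}) = \operatorname{sign}\!\Big( \sum_{t=1}^{T} \alpha_t\, h_t(\mathbf{x}) \Big) \\
\text{Neural network (one hidden layer):}\quad
  & \mathbf{h} = \sigma(W \mathbf{x} + \mathbf{b}) \\
\text{Deep learning ($L$ stacked layers):}\quad
  & f(\mathbf{x}) = f_L\big(\cdots f_2(f_1(\mathbf{x}))\big),
    \qquad f_l(\mathbf{z}) = \sigma_l(W_l \mathbf{z} + \mathbf{b}_l)
\end{align*}
```

The product over features in the naive Bayes line is where the feature-independence assumption listed in the table enters: each $P(x_i \mid c)$ is estimated separately, which is also why the model cannot capture feature combinations.
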
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).







