Submitted: 17 June 2025
Posted: 19 June 2025
Abstract
Keywords:
I. Introduction
- A. GENERAL WORKFLOW OF CREDIT CARD FRAUD DETECTION

- 1. Data Collection
- 2. Data Processing
- 3. Feature Selection
- 4. Model Selection
- 5. Model Training
- 6. Model Evaluation
- 7. Model Deployment
- 8. Continuous Monitoring and Improvement
- B. OVERVIEW OF MACHINE LEARNING MODELS USED
- Random Forest (RF): An ensemble model known for high accuracy and robustness against imbalanced datasets.
- Logistic Regression (LR): A simple, fast model used to predict binary outcomes like fraud (yes/no) using a mathematical equation. Best for linearly separable data.
- XGBoost: A powerful and fast tree-based boosting algorithm that learns from mistakes to give better predictions. Great for large and imbalanced datasets.
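To make the "mathematical equation" behind logistic regression concrete, the following minimal sketch scores a transaction by passing a weighted sum of its features through a sigmoid. The feature names, weights, and bias are hypothetical values chosen for illustration; they are not parameters from this study, and a real model would learn them from labeled transactions.

```python
import math
from typing import Dict

# Hypothetical, hand-picked parameters for illustration only.
WEIGHTS = {"amount_zscore": 1.2, "is_foreign": 0.8, "night_time": 0.5}
BIAS = -3.0


def fraud_probability(features: Dict[str, float]) -> float:
    """Return P(fraud) = sigmoid(w . x + b) for one transaction."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def classify(features: Dict[str, float], threshold: float = 0.5) -> int:
    """Label the transaction 1 (fraud) when the probability reaches the threshold."""
    return int(fraud_probability(features) >= threshold)


# Two made-up transactions: one ordinary, one unusually large, foreign, and late at night.
typical = {"amount_zscore": 0.1, "is_foreign": 0.0, "night_time": 0.0}
suspicious = {"amount_zscore": 3.0, "is_foreign": 1.0, "night_time": 1.0}
print(classify(typical), classify(suspicious))
```

Because the score is a single weighted sum, the decision boundary is linear, which is why the model suits linearly separable data.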

- C. TYPES OF CREDIT CARD FRAUD
- Card Theft: Physical stealing of a credit card to make unauthorized purchases.
- Card Skimming: Using hidden devices at ATMs or payment terminals to steal card information.
- Phishing: Fraudsters send fake emails, messages, or websites to trick users into giving up card details.
- Card-Not-Present (CNP) Fraud: Fraudulent transactions made online or over the phone without physical access to the card.
- Account Takeover: Hackers gain control of a credit card account by stealing login credentials and personal details.
- Application Fraud: Criminals apply for a new credit card using stolen or fake identities.
- Chargeback Fraud (Friendly Fraud): A legitimate cardholder falsely claims a valid transaction was unauthorized to get a refund.
- Counterfeit Cards: Fake cards created by cloning the information from a legitimate card's magnetic stripe.
- Mail Theft: Stealing credit card statements or new cards from a victim's mailbox.
- Data Breaches: Large-scale theft of credit card information from businesses or online platforms.
II. Related Work
- Xuetong Niu et al. (2019): Proposed a fraud detection system using multiple machine learning algorithms. The system automatically identifies fraudulent transactions through training and testing on transaction data. Their approach emphasizes efficient transaction pattern recognition using historical transaction analysis. It demonstrates the effectiveness of combining multiple ML techniques. The study highlights the importance of data-driven insights in financial anomaly detection. The authors emphasize future work in incorporating real-time transaction monitoring.
- Emmanuel Ileberi et al. (2022): Developed a fraud detection engine using genetic algorithms (GA) for feature selection. Compared classifiers like Decision Tree, Random Forest, Logistic Regression, ANN, and Naive Bayes, achieving high accuracy with European credit cardholder data. Their GA approach optimizes relevant feature selection, reducing noise. This improves detection speed and accuracy across several ML models. The paper also investigates class imbalance challenges. Future work could enhance GA-driven ML pipelines with deep learning.
- Omkar Dabade et al. (2022): Designed a detection system using Random Forest, AdaBoost, and XGBoost combined through majority voting. Real-world banking data was used to validate the model's accuracy. The hybrid ensemble showed strong resilience to fraudulent data irregularities. Their method improves classification performance by leveraging multiple algorithm strengths. It outperforms standalone models on benchmark metrics. They recommend testing on more diverse datasets.
- Dr. K. Maithili et al. (2023): Focused on machine learning-based fraud detection. Highlighted the limitations of traditional rule-based systems and used data preprocessing to improve model accuracy. Their approach integrates data balancing techniques to optimize training results. The study emphasizes enhancing feature extraction and transformation. It highlights model robustness in handling evolving fraud tactics. Future improvements could explore ensemble models.
- Sreelekshmi S. & Shilpa A. (2023): Proposed a multi-algorithm fraud detection system that identifies fraudulent activities automatically using transaction data, enhancing detection with effective model training and testing. The study emphasizes the combination of classification and anomaly detection techniques. Model evaluation includes key metrics like sensitivity and specificity. The research supports scalable real-time fraud detection deployment. Future scope includes deep learning model experimentation.
- Syeda Farjana Farabi et al. (2024): Evaluated nine ML algorithms including Logistic Regression, Decision Trees, Random Forest, Naive Bayes, KNN, and ANN. Measured performance using accuracy, F1-score, sensitivity, and specificity. This comparative analysis helped in identifying the best-performing model. The research stressed the role of precision in fraud identification. Ensemble models emerged as top contenders. Further enhancement can come from feature optimization strategies.
- Yao Zou & Dawei Cheng (2025): Introduced a HOGRL model using mixture-of-expert attention and high-order graph learning. It outperformed baselines in fraud camouflage detection, recommending adaptive GNNs for future work. Their system uses advanced graph learning for relationship modelling. The attention mechanism prioritizes key features in detection. It effectively uncovers hidden fraud patterns. The study calls for continued research into graph-based fraud solutions.
- Mir Mohtasam Hossain Sisan et al. (2025): Studied ML-based real-time fraud detection using supervised and unsupervised methods. Suggested integrating AI identity systems with blockchain for secure financial systems. Their framework evaluates transaction legitimacy on-the-fly. This reduces decision latency in online payments. Blockchain integration offers added transparency and traceability. Their study promotes fusion of AI and cybersecurity techniques.
- Angel Jones & Marwan Omar (2025): Employed the LOF algorithm on unbalanced data for anomaly detection. Recommended further work on threshold tuning and integrating LOF with other ML methods. Their preprocessing pipeline improves detection accuracy. LOF showed robustness against minority class suppression. Model tuning significantly affected false positive rates. Future work includes real-time LOF deployment.
- Weddou Mohamedhen et al. (2025): Combined Federated Learning (FL), LSTM, and SMOTE for privacy-preserving fraud detection on imbalanced data. Suggested further tuning of FL parameters and enhancing privacy with differential privacy techniques. Their framework enables collaborative model training without data sharing. LSTM captured sequential transaction dependencies effectively. SMOTE balanced fraud class distribution. The approach promotes secure and accurate fraud systems.
- Btoush et al. (2025): Similar to Weddou’s work, combined FL, LSTM, and SMOTE for effective fraud detection while preserving data privacy across financial institutions. Their system benefits from distributed intelligence. It ensures scalability and compliance with data protection laws. SMOTE further strengthened class balance. Their results recommend continuous model updates for evolving patterns.
- Kibet & Tonui (2025): Compared CNNs, LSTMs, and Autoencoders for fraud detection. Used SMOTE to handle class imbalance and found that the CNN+LSTM combination outperformed traditional models. Their deep learning models captured spatial and temporal transaction features. Results highlight generalization and robustness. The study also focused on minimizing false positives. Future work includes hybrid architectures with blockchain.
- Ghosh Dastidar (2025): Proposed a context-aware fraud detection method using Neural Aggregate Generator (NAG) and GANs to generate synthetic data. Suggested using attention-based transformers in future work. The contextual approach improved fraud signature recognition. GANs enriched model learning with diverse data. The research promotes adaptive learning in fraud detection. Future extensions involve real-time transformer-based models.
- Lossan Bonde & Abdoul Karim Bichanga (2025): Developed a hybrid model combining CNN, GRU, and MLP with SMOTE-ENN. Reported 100% accuracy and recommended developing real-time fraud detection systems. CNN extracted spatial features while GRU analyzed sequences. MLP acted as the final classifier. The SMOTE-ENN preprocessing balanced data and improved learning. The authors call for improved computation and deployment capabilities.
- Mniai Ayoub et al. (2025): Introduced GrCF, combining CBR and FRS with BGWO for better parameter tuning and feature selection. Demonstrated high speed and accuracy in detecting new fraud patterns. Granular computing handled complex feature sets efficiently. FRS filtered redundant features while BGWO optimized performance. The system dynamically learns evolving fraud behavior. It sets a foundation for real-time adaptive systems.
- Ahmed Samer et al. (2025): Reviewed the GrCF model by Ayoub et al. and analyzed its effectiveness compared to traditional ML methods, focusing on its feature selection and hyperparameter optimization. Their evaluation validated GrCF’s practical efficiency. The study highlights the importance of optimized parameter tuning. Compared with conventional methods, it showed better speed and reliability. The paper recommends expanding GrCF across varied fraud scenarios.
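Several of the surveyed systems combine classifiers through majority voting (for example, the Random Forest, AdaBoost, and XGBoost ensemble of Dabade et al.). The sketch below shows only the voting step, assuming each model has already produced a 0/1 label per transaction. The label lists are made up, and the tie-breaking rule (a split vote counts as fraud) is an assumption of this example rather than something the cited work specifies.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-model label lists into one label per transaction.

    predictions[m][i] is model m's label (0 = legitimate, 1 = fraud) for
    transaction i. Ties are resolved as fraud (an assumption of this sketch).
    """
    n_items = len(predictions[0])
    if any(len(p) != n_items for p in predictions):
        raise ValueError("all models must label the same transactions")
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        combined.append(1 if counts[1] >= counts[0] else 0)
    return combined


# Made-up labels from three hypothetical models for four transactions.
rf_labels = [0, 1, 1, 0]
ada_labels = [0, 1, 0, 0]
xgb_labels = [1, 1, 0, 0]
print(majority_vote([rf_labels, ada_labels, xgb_labels]))
```

With an odd number of models and binary labels no tie can occur, so the tie rule only matters for even-sized ensembles.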
III. Proposed Work (Methodology)
- THE FOLLOWING ARE THE STEPS FOR SYSTEM DEVELOPMENT:
- Data Collection: Collect data about credit card transactions, such as the amount, time, location, and whether the transaction was fraudulent.
- Data Processing: Clean the data by removing errors and handling missing information, and convert the data into a form the model can understand.
- Feature Selection: Select the most important features that help in predicting fraud and remove the ones that are not useful.
- Model Selection and Training: Choose a machine learning model such as Logistic Regression, Random Forest, or a neural network, and train it using past transaction data.
- Model Evaluation: Once the model is trained, test how well it works using evaluation metrics such as accuracy and precision, and choose the best-performing model.
- Model Deployment: After selecting the best model, deploy it so it can start checking credit card transactions for fraud.
- Continuous Monitoring and Improvement: Keep monitoring the model to make sure it still performs well and, if needed, improve it by retraining on new data.
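The evaluation step can be illustrated with a small sketch that computes accuracy, precision, recall, and F1 from true and predicted labels. The toy labels are invented for this example. They are chosen to show why accuracy alone can mislead when fraud is rare: a model that never flags anything still scores 98% accuracy here.

```python
from typing import Dict, Sequence


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute basic binary-classification metrics (1 = fraud, 0 = legitimate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (made up): 98 legitimate transactions followed by 2 fraudulent ones.
y_true = [0] * 98 + [1] * 2
always_legit = [0] * 100                 # never flags anything
one_hit = [0] * 97 + [1] + [1, 0]        # one false alarm, one fraud caught
print(evaluate(y_true, always_legit))
print(evaluate(y_true, one_hit))
```

Both predictors reach the same accuracy, but only precision, recall, and F1 reveal that the first one detects no fraud at all.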

- III. CHALLENGES IN CREDIT CARD FRAUD DETECTION
- Although credit card fraud detection (CCFD) systems have improved considerably with machine learning, numerous issues remain to be addressed. Some of the most significant are:
- Imbalanced Datasets: Fraudulent transactions are extremely rare compared to legitimate ones, making it difficult for models to learn meaningful fraud patterns without bias toward the majority class.
- Evolving Fraud Techniques: Fraudsters continuously change their strategies, so detection models must be updated frequently to stay effective against new and sophisticated fraud schemes.
- High False Positives: Many systems incorrectly flag legitimate transactions as fraudulent, leading to customer dissatisfaction and unnecessary operational costs for financial institutions.
- Real-Time Detection Requirements: Achieving high accuracy while processing millions of transactions in real time remains a technical and computational challenge for fraud detection systems.
- Data Privacy and Security Concerns: Accessing and sharing sensitive transaction data for model training and testing is often restricted by strict privacy regulations, which limits model performance and cross-institutional collaboration.
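The false-positive challenge is largely a matter of where the decision threshold is placed on a model's fraud score. As a minimal sketch, the code below counts false positives and missed frauds at several thresholds. The scores and labels are invented for illustration and do not come from any dataset discussed here.

```python
from typing import List, Sequence, Tuple


def sweep_thresholds(
    scores: Sequence[float],
    labels: Sequence[int],
    thresholds: Sequence[float],
) -> List[Tuple[float, int, int]]:
    """Return (threshold, false_positives, false_negatives) for each threshold.

    A transaction is flagged as fraud when its score is >= the threshold.
    """
    rows = []
    for th in thresholds:
        fp = sum(1 for s, y in zip(scores, labels) if s >= th and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < th and y == 1)
        rows.append((th, fp, fn))
    return rows


# Made-up model scores and ground-truth labels (1 = fraud) for ten transactions.
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.70, 0.85, 0.95]
labels = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]
for row in sweep_thresholds(scores, labels, [0.3, 0.5, 0.8]):
    print(row)
```

Raising the threshold reduces false alarms but lets more fraud through, so the operating point has to reflect the relative cost of each error type.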
- IV. FUTURE PROSPECTS IN CREDIT CARD FRAUD DETECTION (CCFD)
- Although credit card fraud detection has its challenges, the future is promising. As technology and research advance, CCFD systems will become smarter, faster, and more accurate. Some promising directions are:
- a) Enhanced Feature Engineering: Future offline fraud detection systems will find clusters of abnormal data in static datasets more effectively through transaction analysis.
- b) Incorporation of Explainable AI (XAI): Banks need to know how their fraud detection models operate before approving them, so systems should provide clear explanations of their automated decisions.
- c) Federated Learning for Offline Datasets: Detection systems can build more secure and dependable fraud prediction models by letting multiple institutions pool their analytical knowledge.
- d) Synthetic Data Generation for Model Training: When real fraudulent data is scarce, GANs can create realistic synthetic records to support offline model training.
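The federated learning direction can be illustrated by its aggregation step. The sketch below implements plain federated averaging: each institution sends only its locally trained weight vector and sample count, and a coordinator averages the weights in proportion to the counts. The institution names, weight values, and sample counts are hypothetical, and the sketch omits local training, secure communication, and any differential privacy mechanism.

```python
from typing import List, Sequence, Tuple


def federated_average(updates: Sequence[Tuple[Sequence[float], int]]) -> List[float]:
    """Average client weight vectors, weighting each by its local sample count.

    Each update is (weights, n_samples). Only these values leave a client;
    raw transactions stay local.
    """
    total = sum(n for _, n in updates)
    if total == 0:
        raise ValueError("no training samples reported")
    dim = len(updates[0][0])
    averaged = [0.0] * dim
    for weights, n in updates:
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, w in enumerate(weights):
            averaged[i] += w * n / total
    return averaged


# Hypothetical updates from three institutions after one local training round.
bank_a = ([0.2, -1.0], 1000)
bank_b = ([0.4, -0.5], 3000)
bank_c = ([0.0, -2.0], 1000)
global_weights = federated_average([bank_a, bank_b, bank_c])
print(global_weights)
```

Weighting by sample count means the institution with the most transactions has the largest influence on the shared model, which may or may not be desirable when fraud rates differ across institutions.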
V. Conclusions
References
- M. Ayoub, T. Abdelhamid, and J. Khalid, "Granular computing framework for credit card fraud detection," Alexandria Engineering Journal, vol. 121, no. February, pp. 387–401, 2025.
- D. Lunghi, Y. Molinghen, A. Simitsis, T. Lenaerts, and G. Bontempi, "FRAUD-RLA: A new reinforcement learning adversarial attack against credit card fraud detection," 2025, arXiv preprint. arXiv:2502.02290.
- Y. Zou and D. Cheng, "Effective High-order Graph Representation Learning for Credit Card Fraud Detection," in Proc. IJCAI, pp. 7581–7589, 2024.
- A. S. I. Al-Dulaimi, I. R. Abdelmaksoud, S. Abdelrazek, and H. M. El-Bakry, "An intelligent credit card fraud detection model using data mining and ensemble learning," Edelweiss Applied Science and Technology, vol. 9, no. 2, pp. 1391–1405, 2025.
- M. M. H. Sizan et al., "Advanced Machine Learning Approaches for Credit Card Fraud Detection in the USA: A Comprehensive Analysis," Journal of Ecohumanism, vol. 4, no. 2, pp. 883–905, 2025.
- A. Jones and M. Omar, "Unveiling the Potential of Local Outlier Factor in Credit Card Fraud Detection," International Journal of Informatics, Information System and Computer Engineering, pp. 1–13, 2026.
- W. Mohamedhen and M. Charfeddine, "Enhanced Credit Card Fraud Detection Using Federated Learning, LSTM Models, and the SMOTE Technique," in Proc. ICAART, vol. 3, pp. 368–375, 2025.
- E. Btoush, X. Zhou, R. Gururajan, K. C. Chan, and O. Alsodi, "Achieving Excellence in Cyber Fraud Detection: A Hybrid ML+DL Ensemble Approach for Credit Cards," Applied Sciences, vol. 15, no. 3, 2025.
- E. Oztemel and M. Isik, "A Systematic Review of Intelligent Systems and Analytic Applications in Credit Card Fraud Detection," Applied Sciences, vol. 15, no. 3, pp. 1–22, 2025.
- European Commission, "No Title," Volume 4, no. 1, pp. 1–23, 2016.
- S. Siddhish and C. Sekaran, "Identifying the ideal machine learning model for credit card fraud detection," Volume 12, no. 12, pp. 1–10, 2024.
- M. Sahu and R. Prasad, "Credit Card Fraud Detection: Survey and Discussion," Volume 3404, no. 1, pp. 1–6, 2025.
- L. Bonde, "Improving Credit Card Fraud Detection with Ensemble Deep Learning-Based Models: A Hybrid Approach Using SMOTE," 2025.
- H. Zheng, "Federated Learning-Based Credit Card Fraud Detection: A Comparative Analysis of Advanced Machine Learning Models," Paper No. 01022, pp. 1–6, 2025.
- M. Tayebi and S. El Kafhali, "Generative Modeling for Imbalanced Credit Card Fraud Transaction Detection," Journal of Cybersecurity and Privacy, vol. 5, no. 1, pp. 1–36, 2025.
- Y. R. K. Chakrabarti, "An intelligent framework for credit card fraud detection through data analytics," Volume 28, no. 1, pp. 139–149, 2025.
- Y. Wu, L. Wang, H. Li, and J. Liu, "A Deep Learning Method of Credit Card Fraud Detection Based on Continuous-Coupled Neural Networks," Mathematics, vol. 13, no. 5, pp. 1–18, 2025.
- J. W. Alexander, "University of California, Los Angeles," Professional Geographer, vol. 9, no. 3, pp. 28–32, 1957.
- D. Salahudin-Mukeem and O. Ekundayo, "Hybrid Data Mining Technique for Credit Card Fraud Detection," Preprints, pp. 1–13, 2025.
- I. Y. Hafez, A. Y. Hafez, A. Saleh, A. A. Abd El-Mageed, and A. Abohany, "A systematic review of AI-enhanced techniques in credit card fraud detection," Journal of Big Data, vol. 12, no. 1, 2025.
- X. Fan and T. J. Boonen, "Machine Learning Algorithms for Credit Card Fraud Detection: Cost-Sensitive and Ensemble Learning Enhancements," unpublished manuscript, 2025.
- A. Hassan, A. Khader, J. Saudagar, S. Bhanja, and A. Das, "Data-Driven Methods for Credit Card Fraud Detection Using Machine Learning," Issue No. 3, 2021.
- I. P. Ojo and A. Tomy, "Explainable AI for credit card fraud detection: Bridging the gap between accuracy and interpretability," Volume 25, no. 2, pp. 1246–1256, 2025.
- A. Srivastava, A. Kundu, S. Sural, and A. Majumdar, "Credit Card Fraud Detection Using Hidden Markov Model," IEEE Transactions on Dependable and Secure Computing, vol. 5, no. 1, pp. 37–48, 2008.
- V. Dal Pozzolo, O. Caelen, R. A. Johnson, and G. Bontempi, "Calibrating Probability with Undersampling for Unbalanced Classification," in Proc. IEEE Symposium Series on Computational Intelligence, pp. 159–166, 2015.
- C. Whitrow, D. J. Hand, P. Juszczak, D. Weston, and N. M. Adams, "Transaction Aggregation as a Strategy for Credit Card Fraud Detection," Data Mining and Knowledge Discovery, vol. 18, no. 1, pp. 30–55, 2009.
- A. Bahnsen, D. Aouada, A. Stojanovic, and B. Ottersten, "Feature Engineering Strategies for Credit Card Fraud Detection," Expert Systems with Applications, vol. 51, pp. 134–142, 2016.
- J. West and M. Bhattacharya, "Intelligent Financial Fraud Detection: A Comprehensive Review," Computers & Security, vol. 57, pp. 47–66, 2016.
- P. Phua, V. Lee, K. Smith, and R. Gayler, "A Comprehensive Survey of Data Mining-Based Fraud Detection Research," arXiv preprint. arXiv:1009.6119, 2010.
| Author(s) | Year | Method/Focus | Outcome | Limitation |
|---|---|---|---|---|
| Xuetong Niu et al. | 2019 | Multiple ML Algorithms | Successfully detects fraud using combined models. Improves accuracy and automation. | May lack adaptability to evolving fraud techniques. |
| Omkar Dabade et al. | 2022 | RF, AdaBoost, XGBoost (Voting) | Ensemble methods improve fraud detection accuracy in real-world data. | Performance may drop on imbalanced datasets. |
| Emmanuel Ileberi et al. | 2022 | GA + DT, RF, LR, ANN, NB | Feature selection via GA boosts model performance using European dataset. | Limited validation on diverse geographies. |
| Dr. K. Maithili et al. | 2023 | ML with Data Preprocessing | Balancing and feature enhancement strengthens fraud detection. | Focuses mostly on preprocessing, less on model innovation. |
| Sreelekshmi S. & Shilpa A. | 2023 | Multiple ML Algorithms | Effective classification of fraud via supervised training/testing. | Lacks real-time application and hybrid techniques. |
| Syeda Farjana Farabi et al. | 2024 | LR, DT, RF, NB, KNN, ANN | Compared 9 ML models; RF and ensemble performed best. | Further model tuning and deeper feature engineering needed. |
| Yao Zou & Dawei Cheng | 2025 | HOGRL (Graph Learning) | Outperforms other models in camouflage fraud detection using graph learning. | Complexity in implementing adaptive GNN frameworks. |
| Ahmed Samer et al. | 2025 | Review of GrCF Framework | Validates GrCF’s superiority over traditional ML systems. | Lacks practical implementation data in diverse settings. |
| Mniai Ayoub et al. | 2025 | GrCF: CBR + FRS + BGWO | Uses granular computing and optimization for faster fraud detection. | May require more tuning on diverse datasets. |
Disclaimer/Publisher's Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
