Preprint
Article

This version is not peer-reviewed.

AI-Based Financial Transaction Monitoring and Fraud Prevention with Behaviour Prediction

A peer-reviewed version of this preprint was published in:
Applied and Computational Engineering 2024, 67(1), 76-82. https://doi.org/10.54254/2755-2721/67/2024ma0068

Submitted: 13 July 2024

Posted: 16 July 2024


Abstract
In this study, we explored the application of deep learning techniques for credit card fraud detection, aiming to improve the performance and reliability of anomaly detection methods in financial transactions. We first utilized the Isolation Forest algorithm, achieving a detection accuracy of 26% for the top 1000 transactions. Subsequently, we experimented with the Autoencoder algorithm, an unsupervised deep neural network model, which enhanced the detection accuracy to 33.6% in the best case despite some fluctuations. The results demonstrate deep learning models' strong feature extraction capability and adaptability, highlighting their potential to surpass traditional methods. However, the high imbalance in the dataset, with only 0.17% of transactions being fraudulent, poses a significant challenge. This study underscores the necessity for further experimentation and optimization of network structures and hyperparameters to achieve more stable and efficient fraud detection. The findings provide valuable insights and reference points for future research in financial fraud detection using deep learning methodologies.

1. Introduction

Article 11 of the Measures for the Administration of Large Transactions and Suspicious Transaction Reports by Financial Institutions provides that “If a financial institution finds or has reasonable grounds to suspect that a customer, the customer’s funds or other assets, the customer’s transactions or attempted transactions are related to criminal activities such as money laundering or terrorist financing, it shall file a suspicious transaction report, regardless of the amount of funds involved or the value of the assets involved.” [1,2] Financial institutions shall establish a sound transaction monitoring system that identifies transactions possibly involving money laundering or other predicate crimes by analysing customer and transaction information, and shall conduct further due diligence on them. If there are reasonable grounds for suspicion, or the suspicion cannot be ruled out, a suspicious transaction report shall be submitted to the China Anti-Money Laundering Monitoring and Analysis Centre and the relevant departments. Suspicious transaction monitoring can thus detect and prevent the flow of illegal funds, and it helps safeguard the security and stability of the financial system, combat criminal activities and maintain social fairness and justice. Starting from the common difficulties that financial institutions currently face, and drawing on fraud prediction in financial transaction monitoring, this paper discusses suggestions for improving the effectiveness, timeliness and integrity of suspicious transaction monitoring and identification.

3. Application of AI Fraudulent Behaviour Prediction

In a world where transactions and interactions take place almost entirely online, fraud is a major threat. As more and more financial transactions take place in the digital space, controls should be in place to ensure security. Artificial intelligence has proven to be an effective tool in the fight against fraud. It learns from a sufficient amount of data and identifies patterns and deviations in order to detect and prevent illegal behaviour.

3.1. Traditional fraud detection methods

Traditional rule-based fraud detection methods perform poorly in today’s financial transaction environment, mainly because of false positives and false negatives. False positives delay legitimate transactions before confirmation and trigger further investigation, causing inconvenience without any benefit. False negatives are even more damaging: the financial institution fails to prevent fraudulent activity and suffers financial loss and reputational damage. Both problems share a root cause: the systems rely on pre-defined rules that may not cover all possibilities, and the sheer number of rules makes them hard to modify. There is therefore a need for more intelligent and flexible fraud detection methods.
Second, data quality can limit the performance of traditional fraud detection systems. Incomplete, incorrect or outdated data can compromise a system’s ability to identify fraud patterns. Given the volume and variety of data collected today, it is difficult to obtain high-quality data that can be properly interpreted. [14] Ensuring that data sources are reliable and timely is nevertheless critical to improving the results of legacy systems, and it is a particular challenge for organisations that work with legacy systems and mixed data sources.
However, with the advent of artificial intelligence and machine learning technologies, financial services organisations have an opportunity to overcome these challenges. Artificial intelligence and machine learning techniques can help process large amounts of data quickly and in real time, identify subtle patterns that may indicate fraud, and adapt to new fraud strategies. Artificial intelligence and machine learning technologies use predictive modelling, natural language processing and anomaly detection techniques to help organisations improve the accuracy and efficiency of fraud detection.

3.2. Fraud detection with AI

Artificial intelligence plays an important role in fraud detection, using complex algorithms to analyse activity, identify anomalies and spot fraud in large data sets. AI systems learn from historical data, which in practice means they get better at predicting and identifying fraud over time by adapting to the new techniques used by fraudsters. [15] This includes automated anomaly detection, behavioural analysis and natural language processing, which can identify and evaluate trends and activities that may indicate fraud. AI fraud detection works by observing operations, establishing a baseline of normal behaviour, and refining its judgments to distinguish legitimate from fraudulent operations in real time. By quickly processing large amounts of data, it can identify subtle fraud patterns that would otherwise cause financial damage, which helps maintain consumer confidence. In addition, AI technology can be applied to transaction verification more broadly, monitoring transactions and their many distinguishing features, and it can use behavioural biometrics to recognise characteristic signs of identity theft. Artificial intelligence is therefore an effective tool for maintaining transaction security and preventing fraud losses.
The use of artificial intelligence and machine learning algorithms can revolutionise the way organisations in different industries identify and prevent fraud.
1. Predictive modelling
Artificial intelligence and machine learning algorithms can analyse historical data to predict the likelihood of future fraudulent activity. By identifying patterns and anomalies in data, predictive models can proactively identify potential fraud before it occurs, enabling organisations to take preventative action.
2. Anomaly detection
Artificial intelligence and machine learning techniques are good at identifying unusual patterns of behaviour that could indicate fraud. For example, a sudden change in customer behaviour, such as a large purchase from a new location, can be flagged as an indicator of potential fraud for further investigation and mitigation.
3. Natural Language Processing (NLP) [16]
NLP is another key area where artificial intelligence and machine learning play an important role in fraud detection. By analysing written communications such as emails and chat logs, these technologies can identify suspicious behaviour such as unusual language use or requests, helping to identify fraudulent activity at an early stage.
4. Machine Vision
Machine vision is a technology that uses computer vision to analyse images and video, which can be used to detect fraudulent activity such as counterfeit goods or to identify people in surveillance footage. This visual analysis capability enhances fraud detection in a variety of settings.
5. Continuous learning
AI algorithms can be continuously trained with new data to improve their accuracy and effectiveness over time. This continuous learning approach ensures that fraud detection systems are always aware of the latest fraud trends and patterns, improving their overall effectiveness in identifying and preventing fraudulent activity.
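As a minimal sketch of the anomaly-detection idea in item 2, the following Python snippet flags a transaction whose amount deviates strongly from a customer's own history or that comes from a location not seen before. The function name `flag_transaction`, the threshold of three standard deviations and the sample history are illustrative assumptions, not part of the system studied in this paper.

```python
from statistics import mean, stdev

def flag_transaction(history, amount, location, z_threshold=3.0):
    """Return the reasons why a new transaction looks unusual.

    history: list of (amount, location) tuples for one customer.
    z_threshold: hypothetical cut-off, in standard deviations.
    """
    amounts = [a for a, _ in history]
    known_locations = {loc for _, loc in history}
    reasons = []
    mu, sigma = mean(amounts), stdev(amounts)
    # Amount far away from the customer's usual spending level.
    if sigma > 0 and abs(amount - mu) / sigma > z_threshold:
        reasons.append("unusual amount")
    # Location never observed in this customer's history.
    if location not in known_locations:
        reasons.append("new location")
    return reasons

# Synthetic history for one hypothetical customer.
history = [(20.0, "London"), (35.0, "London"), (25.0, "Leeds"), (30.0, "London")]
large_new = flag_transaction(history, 900.0, "Lagos")
routine = flag_transaction(history, 28.0, "London")
print(large_new, routine)
```

A production system would combine many more behavioural features; the point here is only that "unusual" is measured against each customer's own baseline.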

3.3. Using Artificial Intelligence and Machine Learning Algorithms in Fraud Detection

In fraud detection, specific machine learning algorithms play a crucial role in identifying and preventing fraudulent activity. Here is an explanation of some of the key algorithms commonly used in fraud detection:
1. Logistic regression
Logistic regression is a fundamental algorithm in fraud detection and is particularly useful when the outcomes are categorical, such as determining whether a transaction is fraudulent or not. By fitting the data to a logistic function, it can estimate the probabilities of different outcomes, providing insight into the likelihood of fraud based on specific parameters and historical data. Its simplicity and interpretability make it a valuable tool for analysing transaction data and identifying potentially fraudulent activity.
2. Decision Tree
Decision trees are multifunctional algorithms that excel at creating interpretable rules based on transaction characteristics. In fraud detection, decision trees are used to segment or classify data to predict the likelihood of fraud based on transaction characteristics such as amount, location and frequency. Their intuitiveness allows the creation of rule-based systems that can effectively identify suspicious transactions and flag them for further investigation.
3. Random Forest
Random forests improve on single decision trees by using ensemble learning to raise accuracy and mitigate overfitting. By combining multiple decision trees, random forests aggregate predictions, resulting in more robust and accurate fraud detection. Their ability to handle large data sets and complex patterns makes them particularly effective at identifying fraudulent activity in different trading environments, helping to improve risk mitigation strategies in the financial industry.
4. Neural Networks
Neural networks, inspired by the structure of the human brain, are powerful algorithms capable of learning complex patterns and relationships in data. In fraud detection, neural networks excel at efficiently processing large amounts of transactional data to detect anomalies, classify transactions and identify fraud patterns. Their ability to adapt and detect sophisticated fraud schemes makes them an indispensable tool in the ongoing fight against financial fraud, enabling organisations to stay ahead of emerging threats and protect their assets.
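To make the first item of the list concrete, the sketch below fits a one-feature logistic regression, p(fraud | x) = 1 / (1 + exp(-(w·x + b))), by batch gradient descent in plain Python. The feature values, labels, learning rate and epoch count are made-up assumptions for illustration; they do not come from the dataset used later in this paper.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(fraud | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean log-loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic data: one scaled feature per transaction, label 1 = fraud.
xs = [0.1, 0.3, 0.5, 0.8, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_logistic(xs, ys)
p_low = sigmoid(w * 0.2 + b)    # estimated fraud probability for a small value
p_high = sigmoid(w * 3.8 + b)   # estimated fraud probability for a large value
print(round(p_low, 3), round(p_high, 3))
```

The fitted weight `w` can be read directly as the direction and strength of the feature's effect, which is the interpretability advantage mentioned above.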
Overall, the integration of AI into fraud detection represents a significant step forward in securing digital transactions and increasing trust in online interactions. By harnessing the power of machine learning and data analytics, AI systems can constantly adapt to evolving fraud techniques and stay one step ahead of malicious actors. As AI technology continues to mature, we can expect fraud detection to become more accurate and efficient, further strengthening security measures across industries. However, addressing ethical issues and ensuring transparency in AI-driven fraud detection systems is critical to maintaining trust and accountability. Through ongoing research and collaboration between industry stakeholders, AI will continue to play a key role in enhancing security and fostering trust in the digital ecosystem.

4. Methodology

In recent years, deep learning has shown great potential in anomaly detection. In particular, deep learning methods excel when it comes to practical problems such as credit card fraud detection. By using deep learning algorithms, we are able to identify unusual transactions more effectively, helping financial institutions to reduce potential losses.

4.1. Experimental Design

In our study, we used a common credit card fraud dataset to evaluate the performance of different algorithms. First, we used the Isolation Forest algorithm; the results show that the detection accuracy for the top 1000 transactions can reach 26%. Although this result is acceptable, we went on to explore more advanced deep learning methods in the hope of achieving better performance.
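The isolation idea behind the Isolation Forest can be sketched in a few lines of plain Python. This is a simplified variant and not the implementation used in the experiments: it grows full random trees on the whole dataset and omits the subsampling, height limit and score normalisation of the standard algorithm. The function names, the number of trees, the seed and the synthetic points are assumptions made for illustration.

```python
import random

def build_tree(points, rng):
    """Recursively isolate points with random axis-aligned splits."""
    if len(points) <= 1:
        return None  # leaf: the point is isolated
    dims = [d for d in range(len(points[0]))
            if min(p[d] for p in points) < max(p[d] for p in points)]
    if not dims:
        return None  # identical points cannot be split further
    d = rng.choice(dims)
    lo = min(p[d] for p in points)
    hi = max(p[d] for p in points)
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[d] < split]
    right = [p for p in points if p[d] >= split]
    if not left or not right:
        return build_tree(points, rng)  # degenerate draw, try again
    return (d, split, build_tree(left, rng), build_tree(right, rng))

def path_length(tree, point):
    """Number of splits needed to isolate the point in one tree."""
    depth = 0
    while tree is not None:
        d, split, left, right = tree
        tree = left if point[d] < split else right
        depth += 1
    return depth

def mean_path_lengths(points, n_trees=100, seed=0):
    """Average isolation depth per point; short paths suggest anomalies."""
    rng = random.Random(seed)
    trees = [build_tree(points, rng) for _ in range(n_trees)]
    return [sum(path_length(t, p) for t in trees) / n_trees for p in points]

# Synthetic data: a 7 x 7 grid of "normal" points plus one distant outlier.
inliers = [(i / 6, j / 6) for i in range(7) for j in range(7)]
points = inliers + [(10.0, 10.0)]
lengths = mean_path_lengths(points)
print(lengths[-1], min(lengths[:-1]))
```

Points that are isolated after very few random splits, like the distant outlier here, receive the highest anomaly ranking.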
Next, we tried an autoencoder, an unsupervised deep neural network model suited to anomaly detection tasks. After several experiments, we found that the autoencoder improved the detection accuracy for the top 1000 transactions to 33.6% in the best case. However, this result fluctuates considerably, and the detection accuracy sometimes drops to around 25%.
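The principle of autoencoder-based anomaly detection is that a model trained to reconstruct normal data reconstructs unusual inputs poorly, so the reconstruction error serves as an anomaly score. The following toy sketch uses a linear autoencoder with two inputs, a one-dimensional code and two outputs. It is an assumption-laden illustration, not the network architecture used in the experiments, which is not specified here; the initial weights, learning rate and synthetic data are hypothetical.

```python
def train_linear_autoencoder(data, epochs=2000, lr=0.05):
    """Train a 2 -> 1 -> 2 linear autoencoder with batch gradient descent."""
    w = [0.5, 0.5]  # encoder weights (2 inputs -> 1 code)
    v = [0.5, 0.5]  # decoder weights (1 code -> 2 outputs)
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_v = [0.0, 0.0]
        for x in data:
            z = w[0] * x[0] + w[1] * x[1]              # encode
            err = [v[j] * z - x[j] for j in range(2)]  # reconstruction minus input
            back = err[0] * v[0] + err[1] * v[1]       # error propagated to the code
            for j in range(2):
                grad_v[j] += 2 * err[j] * z / n
                grad_w[j] += 2 * back * x[j] / n
        for j in range(2):
            w[j] -= lr * grad_w[j]
            v[j] -= lr * grad_v[j]
    return w, v

def reconstruction_error(x, w, v):
    """Squared reconstruction error, used as the anomaly score."""
    z = w[0] * x[0] + w[1] * x[1]
    return sum((v[j] * z - x[j]) ** 2 for j in range(2))

# Synthetic "normal" data lying on the line y = 2x.
normal = [(i / 10, 2 * i / 10) for i in range(-10, 11)]
w, v = train_linear_autoencoder(normal)
normal_errors = [reconstruction_error(x, w, v) for x in normal]
outlier_error = reconstruction_error((1.0, -2.0), w, v)  # point off the line
print(max(normal_errors), outlier_error)
```

Ranking transactions by this error and inspecting the highest-scoring ones is one way a top-1000 list could be produced. Random weight initialisation in a real network is one possible source of run-to-run variation.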
Nevertheless, these experimental results show that the application of the autoencoder in credit card fraud detection has great potential. In order to further improve the stability and performance of the model, we need to conduct more experiments to explore and optimise more suitable network structures and hyperparameter settings. Through continuous experimentation and adjustment, we expect to find a more stable and efficient deep learning model that can better meet the challenge of anomaly detection.
The experimental part of this study will describe in detail the dataset, model structure, experimental process, and result analysis we adopted to provide a valuable reference for future research.

4.2. Data Processing

The data show that only 0.17% of transactions are fraudulent, so the dataset is highly skewed. We first run the model on the unbalanced data; only if the resulting accuracy is unsatisfactory do we consider balancing the dataset.
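A short calculation shows why plain accuracy is a weak yardstick at this level of imbalance. The sketch assumes a hypothetical batch of 100,000 transactions with the 0.17% fraud rate reported above and evaluates a trivial model that labels every transaction as legitimate; the function name and batch size are illustrative.

```python
def baseline_metrics(n_transactions, fraud_rate):
    """Metrics of a trivial model that labels every transaction legitimate."""
    n_fraud = round(n_transactions * fraud_rate)
    n_legit = n_transactions - n_fraud
    accuracy = n_legit / n_transactions  # every legitimate case counts as correct
    recall = 0.0                         # no fraudulent case is ever flagged
    return n_fraud, accuracy, recall

# Hypothetical batch size; the fraud rate is the 0.17% reported for the dataset.
n_fraud, accuracy, recall = baseline_metrics(100_000, 0.0017)
print(n_fraud, accuracy, recall)
```

The trivial model is right about 99.83% of the time while catching none of the 170 fraudulent transactions, which is why ranking-based measures such as the top-1000 figure are more informative here than overall accuracy.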

4.3. Plot Correlation Matrix

Correlation matrices graphically give us an idea of how features relate to each other and can help us predict which features are most relevant to the prediction.
In the heat map we can clearly see that most features are not correlated with one another, but some features are positively or negatively correlated. For example, V2 and V5 are strongly negatively correlated with the feature Amount. We also see some correlation between V20 and Amount. This gives us a deeper understanding of the data.
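Each cell of such a correlation matrix is a Pearson correlation coefficient between two feature columns. The following sketch computes the matrix in plain Python; the column names and values are synthetic placeholders, not values from the credit card dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long columns."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def correlation_matrix(columns):
    """Nested dict with the correlation of every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Synthetic columns for illustration only.
columns = {
    "feature_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "feature_b": [2.0, 1.0, 4.0, 3.0, 5.0],
    "amount": [10.0, 8.0, 6.0, 4.0, 2.0],
}
matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Values near -1 or +1 indicate strong negative or positive linear relationships, and values near 0 indicate little linear relationship; a heat map simply colours these cells.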

4.4. Experimental Result

Through this experiment, we gained a deeper understanding of the application of deep learning in financial fraud detection. The results show that while the traditional Isolation Forest algorithm performed acceptably in credit card fraud detection, achieving a detection accuracy of 26% for the top 1000 transactions, deep learning methods, especially the autoencoder, show greater potential. In the best case, the autoencoder’s detection accuracy for the top 1000 transactions improved to 33.6%, although its results fluctuated.
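The text does not define the top-1000 detection accuracy precisely. One plausible reading, stated here as an assumption, is precision at k: the share of truly fraudulent transactions among the k transactions with the highest anomaly scores. The sketch below computes that quantity; the function name, scores and labels are made up for illustration.

```python
def precision_at_k(scores, labels, k):
    """Fraction of true frauds among the k highest-scoring transactions."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)

# Synthetic anomaly scores and ground-truth labels (1 = fraud).
scores = [0.9, 0.1, 0.8, 0.3, 0.7]
labels = [1, 0, 0, 0, 1]
p3 = precision_at_k(scores, labels, k=3)
print(p3)
```

Under this reading, a value of 26% at k = 1000 would mean that 260 of the 1000 highest-scoring transactions are confirmed frauds, and 33.6% would mean 336.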

4.5. Experimental Discussion

The advantages of deep learning methods in financial fraud detection are mainly reflected in the following aspects:
1. Strong feature extraction ability: Deep learning models can automatically extract complex features from data without manually designing features. This makes the model more adaptable in the face of high-dimensional, non-linear and complex data.
2. Strong adaptability: Deep learning models can better adapt to different data distributions and abnormal patterns by adjusting network structure and hyperparameters, thus improving detection accuracy.
3. High potential performance: Although the results of Autoencoder are volatile, further experiments and optimization are expected to find a more stable and efficient network structure, thus stably improving the detection performance.
During the experiment, we also found a severe class imbalance in the dataset, with only 0.17% of transactions being fraudulent. We ran the model without balancing the data; if the detection accuracy proves unsatisfactory in this setting, balancing the dataset can be considered. The correlation matrix analysis clarified the relationships between features, which helps indicate which features are most relevant for prediction.
Overall, this study validates the potential of deep learning in financial fraud detection and provides a valuable reference for subsequent research. We believe that through continuous experimentation and optimization, deep learning models will be able to identify financial fraud more stably and efficiently, thereby helping financial institutions reduce potential losses and improve security.

5. Conclusion

In conclusion, this study demonstrates the significant potential of deep learning methods, particularly the Autoencoder algorithm, in the detection of financial fraud. Our experiments reveal that while traditional algorithms like the Isolation Forest can achieve satisfactory results, deep learning techniques offer superior feature extraction capabilities and adaptability to complex data patterns. Despite some fluctuations in performance, the Autoencoder achieved a best-case detection accuracy of 33.6% for the top 1000 transactions, indicating its promise for further optimization. This research underscores the importance of continuous experimentation and improvement in deep learning models to enhance the stability and efficiency of fraud detection systems, ultimately aiding financial institutions in mitigating risks and safeguarding their operations.
Looking ahead, the application of artificial intelligence (AI) in financial transaction monitoring and behaviour prediction has broad prospects and will greatly enhance the safety, stability and efficiency of the financial system in the future. First, the application of deep learning will greatly improve the accuracy and effectiveness of financial fraud detection. Deep learning algorithms, such as autoencoders and neural networks, are capable of processing complex non-linear data to extract valuable features. Through continuous optimisation, these algorithms will have higher detection accuracy and real-time capability to detect anomalous behaviour more effectively and reduce the false positive rate. Traditional financial transaction monitoring systems are often slow to respond, and advanced AI technology will change this. The financial regulatory system of the future will be able to perform real-time analysis and judgement at the moment a transaction occurs, quickly identifying and blocking suspicious transactions. This will not only significantly reduce the success rate of fraud, but also improve the security and stability of the entire financial system.
AI technology can also help financial institutions to share information and data more efficiently and promote international cooperation. Through harmonised technical standards and data interfaces, AI will facilitate the exchange of information across borders, thereby enhancing the overall capacity of global financial regulation.

References

  1. Shi, Y.; Yuan, J.; Yang, P.; Wang, Y.; Chen, Z. Implementing intelligent predictive models for patient disease risk in cloud data warehousing. Appl. Comput. Eng. 2024, 67, 34–40.
  2. Zhan, T.; Shi, C.; Shi, Y.; Li, H.; Lin, Y. Optimization techniques for sentiment analysis based on LLM (GPT-3). Appl. Comput. Eng. 2024, 67, 41–47.
  3. Lin, Y.; Li, A.; Li, H.; Shi, Y.; Zhan, X. GPU-Optimized Image Processing and Generation Based on Deep Learning and Computer Vision. J. Artif. Intell. Gen. Sci. (JAIGS) ISSN:3006-4023 2024, 5, 39–49.
  4. Chen, Z.; Lou, Y.; Wang, B.; Lei, H.; Yang, P. Application of Cloud-Driven Intelligent Medical Imaging Analysis in Disease Detection. J. Theory Pr. Eng. Sci. 2024, 4, 64–71.
  5. Wang, B.; Lei, H.; Shui, Z.; Chen, Z.; Yang, P. Current State of Autonomous Driving Applications Based on Distributed Perception and Decision-Making. World J. Innov. Mod. Technol. 2024, 7, 15–22.
  6. Jiang, W.; Qian, K.; Fan, C.; Ding, W.; Li, Z. Applications of generative AI-based financial robot advisors as investment consultants. Appl. Comput. Eng. 2024, 67, 28–33.
  7. Yang, J.; Qin, H.; Por, L.Y.; Shaikh, Z.A.; Alfarraj, O.; Tolba, A.; Elghatwary, M.; Thwin, M. Optimizing diabetic retinopathy detection with inception-V4 and dynamic version of snow leopard optimization algorithm. Biomed. Signal Process. Control. 2024, 96.
  8. Fan, C.; Li, Z.; Ding, W.; Zhou, H.; Qian, K. Integrating artificial intelligence with SLAM technology for robotic navigation and localization in unknown environments. Appl. Comput. Eng. 2024, 67, 22–27.
  9. Guo, L.; Li, Z.; Qian, K.; Ding, W.; Chen, Z. Bank Credit Risk Early Warning Model Based on Machine Learning Decision Trees. Journal of Economic Theory and Business Management 2024, 1(3), 24–30.
  10. Li, Z.; Fan, C.; Ding, W.; Qian, K. Robot Navigation and Map Construction Based on SLAM Technology. World J. Innov. Mod. Technol. 2024, 7, 8–14.
  11. Fan, C.; Ding, W.; Qian, K.; Tan, H.; Li, Z. Cueing Flight Object Trajectory and Safety Prediction Based on SLAM Technology. J. Theory Pr. Eng. Sci. 2024, 4, 1–8.
  12. Ding, W.; Tan, H.; Zhou, H.; Li, Z.; Fan, C. Immediate traffic flow monitoring and management based on multimodal data in cloud computing. Appl. Comput. Eng. 2024, 71, 1–6.
  13. Zhou, C.; Zhao, Y.; Liu, S.; Zhao, Y.; Li, X.; Cheng, C. (2024). Research on Driver Facial Fatigue Detection Based on Yolov8 Model.
  14. Xin, Q.; Xu, Z.; Guo, L.; Zhao, F.; Wu, B. IoT traffic classification and anomaly detection method based on deep autoencoders. Appl. Comput. Eng. 2024, 69, 64–70.
  15. Yang, T.; Li, A.; Xu, J.; Su, G.; Wang, J. Deep learning model-driven financial risk prediction and analysis. Appl. Comput. Eng. 2024, 67, 54–60.
  16. Zhou, C.; Zhao, Y.; Zou, Y.; Cao, J.; Fan, W.; Zhao, Y.; Cheng, C. Predict Click-Through Rates with Deep Interest Network Model in E-commerce Advertising. arXiv 2024, arXiv:2406.10239.
  17. He, Z.; Shen, X.; Zhou, Y.; Wang, Y. Application of K-means clustering based on artificial intelligence in gene statistics of biological information engineering. BIC 2024: 2024 4th International Conference on Bioinformatics and Intelligent Computing (pp. 468-473).
  18. Gong, Y.; Zhu, M.; Huo, S.; Xiang, Y.; Yu, H. (2024, March). Utilizing Deep Learning for Enhancing Network Resilience in Finance. In 2024 7th International Conference on Advanced Algorithms and Control Engineering (ICAACE) (pp. 987-991). IEEE.
  19. Zhou, C. , Zhao, Y., Cao, J., Shen, Y., Gao, J., Cui, X.,... & Liu, H. (2024). Optimizing search advertising strategies: Integrating reinforcement learning with generalized second-price auctions for enhanced ad ranking and bidding. arXiv:2405.13381.
  20. Tian, J.; Li, H.; Qi, Y.; Wang, X.; Feng, Y. Intelligent medical detection and diagnosis assisted by deep learning. Appl. Comput. Eng. 2024, 64, 121–126.
  21. Yang, P.; Chen, Z.; Su, G.; Lei, H.; Wang, B. Enhancing traffic flow monitoring with machine learning integration on cloud data warehousing. Appl. Comput. Eng. 2024, 67, 15–21.
  22. Restrepo, D.; Wu, C.; Cajas, S. A.; Nakayama, L. F.; Celi, L. A. G.; Lopez, D. M. (2024). Multimodal Deep Learning for Low-Resource Settings: A Vector Embedding Alignment Approach for Healthcare Applications. medRxiv, 2024-06.
  23. Cajas, S. A. , Restrepo, D., Moukheiber, D., Kuo, K. T., Wu, C., Chicangana, D. S. G.,... & Celi, L. A. A multi-modal satellite imagery dataset for public health analysis in colombia.
  24. hang H, Diao S, Yang Y, Zhong J, Yan Y. Multi-scale image recognition strategy based on convolutional neural network. Journal of Computing and Electronic Information Management. 2024 Apr 30;12(3):107-13.
  25. Zhou, Y.; Zhan, T.; Wu, Y.; Song, B.; Shi, C. RNA secondary structure prediction using transformer-based deep learning models. Appl. Comput. Eng. 2024, 64, 88–94.
  26. Liu, B.; Cai, G.; Ling, Z.; Qian, J.; Zhang, Q. Precise positioning and prediction system for autonomous driving based on generative artificial intelligence. Appl. Comput. Eng. 2024, 64, 42–49.
  27. Cui, Z.; Lin, L.; Zong, Y.; Chen, Y.; Wang, S. Precision gene editing using deep learning: A case study of the CRISPR-Cas9 editor. Appl. Comput. Eng. 2024, 64, 134–141.
  28. Rosner, B.; Tamimi, R.M.; Kraft, P.; Gao, C.; Mu, Y.; Scott, C.G.; Winham, S.J.; Vachon, C.M.; Colditz, G.A. Simplified Breast Risk Tool Integrating Questionnaire Risk Factors, Mammographic Density, and Polygenic Risk Score: Development and Validation. Cancer Epidemiology Biomarkers Prev. 2020, 30, 600–607.
  29. Wang, B.; He, Y.; Shui, Z.; Xin, Q.; Lei, H. Predictive optimization of DDoS attack mitigation in distributed systems using machine learning. Appl. Comput. Eng. 2024, 64, 95–100.
  30. Zhan, X.; Ling, Z.; Xu, Z.; Guo, L.; Zhuang, S. Driving Efficiency and Risk Management in Finance through AI and RPA. Unique Endeavor in Business & Social Sciences 2024, 3, 189–197.
  31. Xu, Z.; Guo, L.; Zhou, S.; Song, R.; Niu, K. Enterprise Supply Chain Risk Management and Decision Support Driven by Large Language Models. Applied Science and Engineering Journal for Advanced Research 2024, 3, 1–7.
  32. Song, R.; Wang, Z.; Guo, L.; Zhao, F.; Xu, Z. (2024). Deep Belief Networks (DBN) for Financial Time Series Analysis and Market Trends Prediction.
Figure 1. Results of training matrix.
Figure 2. Model diagram of training results.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.