A. Dataset
This study uses the publicly available IEEE-CIS Fraud Detection Dataset as the primary experimental foundation. The dataset was jointly released by a financial services provider and a big data platform, and it is widely used for algorithm evaluation in fraud detection research. It reflects real-world online payment scenarios; the labeled training split contains roughly 590,000 transaction records, each marked as fraudulent or legitimate. The data include structured and semi-structured features such as timestamps, transaction amounts, device types, email domains, browser types, and address information.
The dataset is characterized by high feature dimensionality and a highly imbalanced class distribution: fraudulent transactions account for only a small proportion of the total (roughly 3.5% of the labeled training records). It also contains complex cross-domain feature interactions. These characteristics place high demands on the generalization ability of detection models. Some features are anonymized, so models must identify latent causal relationships automatically rather than rely on feature semantics. This supports the evaluation of causal representation learning methods in terms of effectiveness and adaptability.
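The class imbalance described above is commonly handled by reweighting the loss per class. The following is a minimal sketch, assuming inverse-frequency ("balanced") weights; the paper does not state which imbalance handling it uses, and the function name `inverse_frequency_weights` and the toy labels are hypothetical.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Return {class: weight} with weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Toy labels: 1 = fraud, 0 = legitimate.
# The 5% fraud rate is illustrative only, not the dataset's actual rate.
toy_labels = [1] * 5 + [0] * 95
fraud_rate = sum(toy_labels) / len(toy_labels)
weights = inverse_frequency_weights(toy_labels)
print(f"fraud rate: {fraud_rate:.2%}, class weights: {weights}")
```

With these weights, each class contributes the same total weight to the loss, so the rare fraud class is not drowned out by the majority class.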
In addition, the dataset includes both user identity features and transaction behavior features. This makes it suitable for building joint encoders to model fraudulent behavior. Because it closely simulates real-world scenarios and contains large-scale, high-dimensional data, this dataset has become a standard benchmark in fraud detection research. It is well-suited for testing the discrimination capabilities of advanced models in imbalanced, multimodal, and highly heterogeneous environments.
B. Experimental Results
First, this paper gives the comparative experimental results, as shown in Table 1.
Overall, the proposed fraud detection method based on causal representation learning outperforms mainstream models across multiple evaluation metrics. This indicates that causal structure modeling can substantially improve performance in financial risk control tasks. Compared with traditional deep learning methods, this approach not only learns statistical relationships among transaction features but also uncovers latent causal mechanisms, which enhances both discrimination ability and generalization capacity.
In comparison, models such as Transformer and TabNet show strong feature modeling capabilities. However, they still rely on surface-level statistical correlations. Their robustness to distribution shifts and strategy variations remains limited. By introducing a causal encoder and domain-invariance constraints, the proposed method reduces dependency on specific distributions. It maintains high stability and accuracy in complex or changing fraud environments.
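One common way to express a domain-invariance constraint is to penalize how much the loss differs between domains. The sketch below assumes a variance-of-risks penalty, in the style of risk extrapolation. It is an illustration under that assumption, not the paper's objective, and `invariance_regularized_loss`, `penalty_weight`, and the toy batches are hypothetical.

```python
import math

def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy for predicted fraud probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def invariance_regularized_loss(batches_by_domain, penalty_weight=1.0):
    """Mean per-domain risk plus a penalty on the variance of those risks."""
    risks = {d: binary_cross_entropy(p, y) for d, (p, y) in batches_by_domain.items()}
    values = list(risks.values())
    mean_risk = sum(values) / len(values)
    variance = sum((r - mean_risk) ** 2 for r in values) / len(values)
    return mean_risk + penalty_weight * variance, risks

# Toy batches (hypothetical): (predicted probabilities, true labels) per domain.
toy_batches = {
    "E-Commerce": ([0.9, 0.1, 0.2], [1, 0, 0]),
    "Crypto": ([0.6, 0.4, 0.5], [1, 0, 0]),
}
loss, per_domain_risk = invariance_regularized_loss(toy_batches, penalty_weight=10.0)
print(per_domain_risk, loss)
```

The penalty is zero only when every domain incurs the same risk, so minimizing it discourages representations that work well in one distribution at the expense of another.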
In addition, causal representations improve the model's adaptability to class imbalance and cross-domain feature disturbances. While other methods are often affected by abnormal patterns, the proposed model tends to learn the underlying generative mechanisms of fraud. This leads to better discrimination on edge cases. Such capability is especially important for real-world financial systems where fraud is adversarial and rapidly evolving.
In summary, the experimental results validate the effectiveness and feasibility of introducing causal reasoning into fraud detection models. Compared with existing approaches, the proposed method not only achieves superior accuracy but also offers unique advantages in robustness and interpretability. This aligns well with the dual demands for transparency and reliability in financial applications.
Furthermore, this paper evaluates generalization performance under different domain distributions. These experiments assess whether the model maintains consistent performance on data from financial scenarios with distinct feature distributions, reflecting the domain shifts and distributional discrepancies that are common in real-world fraud detection tasks. The corresponding experimental setup and outcomes are illustrated in Figure 2.
The results show that the proposed causal representation learning model demonstrates strong stability across various financial scenarios. In particular, it maintains high discrimination performance in typical settings such as E-Commerce, Banking, and Mobile Pay. This indicates that the transaction mechanisms captured by causal structures share a certain degree of commonality across different data distributions. As a result, the model exhibits cross-scenario generalization ability.
As the domain shifts, especially in settings like Crypto and Cross-Domain with significant distribution differences, model performance slightly declines but remains at a high level. This suggests that the model preserves key causal features under distribution drift or fraud strategy variation. It supports effective pattern recognition across domains. In contrast, traditional pattern-matching methods often fail in such cases. Causal representations offer stronger structural stability.
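A per-domain comparison of this kind can be computed by grouping predictions by scenario before scoring. The following is a minimal sketch, assuming each record carries a domain tag, a true label, and a predicted label; the helper `f1_by_domain` and the toy records are hypothetical and do not reproduce the paper's results.

```python
from collections import defaultdict

def f1_score(y_true, y_pred):
    """F1 for the positive (fraud = 1) class; 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def f1_by_domain(records):
    """records: iterable of (domain, true_label, predicted_label) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for domain, t, p in records:
        grouped[domain][0].append(t)
        grouped[domain][1].append(p)
    return {d: f1_score(ts, ps) for d, (ts, ps) in grouped.items()}

# Toy records (hypothetical values, for illustration only).
toy_records = [
    ("E-Commerce", 1, 1), ("E-Commerce", 1, 1), ("E-Commerce", 0, 0), ("E-Commerce", 0, 0),
    ("Crypto", 1, 1), ("Crypto", 1, 0), ("Crypto", 0, 1), ("Crypto", 0, 0),
]
scores = f1_by_domain(toy_records)
print(scores)
```

Scoring each domain separately, rather than pooling all records, keeps a strong majority domain from masking a decline in a shifted one.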
The experiments also reveal a clear trend in F1-Score variation. This suggests that even under extreme class imbalance, the model achieves a good balance between precision and recall. This is critical for financial risk control systems. It helps reduce false positives on legitimate users while identifying high-risk transactions.
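For reference, the balance between precision and recall that the F1-Score captures follows from the standard definitions below, where TP, FP, and FN denote true positives, false positives, and false negatives for the fraud class. The counts in the worked example are illustrative and are not taken from the paper's experiments.

```latex
\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
    = \frac{2\,TP}{2\,TP + FP + FN}
\]
% Illustrative counts (not from the paper): TP = 80, FP = 20, FN = 40
\[
\text{Precision} = \frac{80}{100} = 0.80, \qquad
\text{Recall} = \frac{80}{120} \approx 0.67, \qquad
F_1 = \frac{160}{220} \approx 0.73
\]
```

Because true negatives do not appear in any of the three expressions, the score is not inflated by the large number of legitimate transactions.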
Overall, the experiment confirms the effectiveness of causal representation learning in cross-domain fraud detection tasks. The model outperforms traditional deep models in environments with diverse transaction patterns and complex strategies. It provides a viable solution for large-scale real-world financial applications.
This paper further evaluates the model's robustness in a multi-strategy fraud simulation environment. The experiment examines whether the model maintains stable performance against fraud strategies that differ in behavioral pattern and complexity. These strategies simulate real-world conditions in which fraud tactics are dynamic, adaptive, and potentially adversarial, and reliable performance under such shifting conditions is critical for practical deployment in financial systems. The details of this evaluation are presented in Figure 3.
The results in Figure 3 demonstrate the robustness of the proposed causal representation learning model under various simulated fraud strategies. The model maintains relatively stable performance across the evaluation metrics for rapid fraud, gradual fraud, and more deceptive or adversarial strategies. These results indicate that the modeled causal structure captures common generative mechanisms across strategies rather than relying on superficial feature differences.
In dynamic fraud environments, traditional methods often depend on handcrafted features or static distributions. As a result, their performance fluctuates significantly when strategies change. In contrast, the proposed method models latent causal factors through causal representation learning. This gives the model stronger adaptability to strategy variations. Such adaptability improves the model's responsiveness to evolving fraud behaviors and ensures greater stability in real-world deployment.
In particular, under complex strategies such as adversarial fraud, the model shows a slight performance decline but still maintains high usability. This indicates that the proposed method possesses a certain level of resistance to interference. The key advantage of causal mechanism modeling lies in its ability to remove misleading superficial features. It helps the model focus on more essential discriminative factors.
In summary, this experiment further confirms the practical adaptability of causal representation learning for fraud detection. This adaptability matters most in financial environments with high strategic variability and complex behavioral patterns, where reliable detection systems must respond to rapidly evolving fraud tactics. By incorporating domain invariance and structural robustness into the model design, the approach moves beyond basic generalization toward robustness, so that the model remains effective across a wide range of fraud strategies and operational scenarios.
The loss-curve figure shows that the loss of the proposed model decreases steadily during training. This indicates that the model converges progressively over iterations and approaches the optimization objective. The training and validation losses track each other closely, which suggests that the model fits the training data well while maintaining good generalization on the validation set.
It is particularly noteworthy that both curves become stable in the middle and later stages. No significant oscillation or overfitting is observed. This indicates that the structural constraints and regularization mechanisms of causal representation learning play a key role during modeling. By introducing latent causal variables and domain-invariant discrimination mechanisms, the model reduces reconstruction error while preserving the stability of causal features.
In addition, the small gap between validation loss and training loss reflects the model’s adaptability to unseen data distributions. This confirms the robustness provided by the causal structure when facing strategy changes or domain differences. Such robustness supports risk control in complex transaction environments. It is especially important in real financial scenarios where fraud strategies change rapidly.
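The two properties read off the loss curve here, a small train/validation gap and a stable late-stage plateau, can be checked numerically from per-epoch loss histories. The following is a minimal sketch; the helper names, the `window` and `tol` thresholds, and the toy loss values are all hypothetical and are not the paper's measurements.

```python
def generalization_gap(train_losses, val_losses):
    """Per-epoch difference between validation loss and training loss."""
    return [v - t for t, v in zip(train_losses, val_losses)]

def has_plateaued(losses, window=3, tol=1e-2):
    """True if the last `window` epoch-to-epoch changes are all smaller than `tol`."""
    if len(losses) < window + 1:
        return False
    recent = losses[-(window + 1):]
    return all(abs(b - a) < tol for a, b in zip(recent, recent[1:]))

# Toy loss histories (hypothetical values, for illustration only).
train = [0.90, 0.60, 0.45, 0.38, 0.375, 0.372, 0.371]
val = [0.95, 0.66, 0.50, 0.42, 0.415, 0.413, 0.412]

gaps = generalization_gap(train, val)
print("largest gap:", max(gaps))
print("plateaued:", has_plateaued(train) and has_plateaued(val))
```

A gap that widens while the training loss keeps falling would be the usual sign of overfitting; a small, steady gap alongside a plateau in both curves matches the behavior described above.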
Overall, the loss curve validates the stability, convergence, and generalization ability of the proposed method during optimization. Through the structural advantages of causal modeling, the model not only learns feature representations efficiently but also effectively suppresses overfitting. This enhances its practical value and reliability in real-world fraud detection tasks.