Submitted:
20 January 2026
Posted:
20 January 2026
Abstract
Keywords:
1. Introduction
2. Overview of Graph-based Anomaly Detection for Intrusion Detection
2.1. Overview of Anomaly Detection
2.1.1. Definitions of Anomaly Detection
2.1.2. Definitions of Graph-based Anomaly Detection
2.2. Anomaly Detection Approaches for Intrusion Detection
2.2.1. Conventional Anomaly Detection Approaches
2.2.2. Graph-based Anomaly Detection Approaches
2.3. GBAD Workflow for Intrusion Detection
3. Graph Construction
3.1. Network Traffic Data Graphs
3.2. Network Logs Graph
3.3. Provenance Graph
3.4. Audit Data Graphs
3.5. Host Logs Graphs
3.6. Controller Area Network Graph
3.7. Multi-Modal Data Graphs
4. Graph Pre-processing
4.1. Graph Data Reduction
4.2. Graph Data Transformation
4.3. Graph Feature Extraction
4.4. Graph Encoding
4.5. Graph Representation Learning (GRL)
4.5.1. Node Embedding
4.5.2. Edge Embedding
4.5.3. Graph/Subgraph Embedding
4.5.4. Structural and Temporal Feature Embedding
5. Graph-based Anomaly Detection
5.1. Two-stage Anomaly Detection
5.1.1. Classification Algorithms
5.1.2. Clustering Algorithms
5.1.3. Outlier Detection Algorithms
5.1.4. Deep Learning (DL)
5.1.5. AutoEncoders
5.1.6. Scoring-based Methods
5.1.7. Ensemble Methods
5.1.8. Hybrid Anomaly Detection Methods
5.2. Graph-based End-to-end Anomaly Detection
5.2.1. Graph Neural Networks
5.2.2. Graph Clustering
5.2.3. Graph Divergence Analysis
5.2.4. Graph AutoEncoder
5.2.5. Other techniques
5.3. Comparative Analysis: Two-stage vs End-to-end Approaches
6. Evaluation and Post-detection Analysis
6.1. Performance Evaluation
6.1.1. Datasets for Evaluation
6.1.2. Evaluation Metrics
6.1.3. Evaluation Methods
6.2. Post-detection Analysis
6.2.1. Interpretability and Explainability in Graph-based Anomaly Detection
6.2.2. Advanced Analysis and Interpretation of Detection Results
7. Conclusion and Future Opportunities
7.1. Scalability of processing large-scale graphs
7.2. Interpretability of graph anomalies
7.3. Challenges in real-time detection
7.4. Data imbalance for graph-based anomaly detection
7.5. Camouflaged, stealthy attack detection
7.6. Lack of robustness against adversarial attacks
Abbreviations
| IDS | Intrusion Detection System |
| A-IDS | Anomaly-based Intrusion Detection System |
| FPR | False Positive Rate |
| GBAD | Graph-based Anomaly Detection |
| GNN | Graph Neural Network |
| CAN | Controller Area Network |
| XAI | eXplainable Artificial Intelligence |
| ML | Machine Learning |
| GRL | Graph Representation Learning |
| GCN | Graph Convolutional Network |
| MLP | Multi-Layer Perceptron |
8. Appendix
Appendix A. Data Capturing and Pre-processing
| Data Source | Data category | Sub-category | Related works |
|---|---|---|---|
| Network | Network traffic data | Packet data | [8,12,55,57,62,65,66,67,69,70,71,79] |
| Network | Network traffic data | Flow data | [58,60,73,74,75,76,77,78] |
| Network | Network logs | | [19,51,81,83,84,85] |
| Network | CAN messages | | [98,99,106] |
| Host | Log data | Host logs | [37] |
| Host | Log data | System logs | [87,91] |
| Host | Audit data | | [89,90,93,94,95,96,97] |
| Host | System calls | | [7,46,49,52,59,86,88,92] |
| Multi-modal | Network events and host logs | | [80] |
| Multi-modal | Microservice metrics, traces and logs | | [101,107] |
| Multi-modal | Microservice traces and log events | | [108] |
Appendix A.1. Network Data
Appendix A.1.1. Network Traffic Data
Appendix A.1.2. Network Logs
Appendix A.1.3. Controller Area Network
Appendix A.2. Host Level Data
Appendix A.2.1. Host Logs
Appendix A.2.2. Audit Data
Appendix A.2.3. System Calls
Appendix A.3. Multi-Modal Data
Appendix B. Datasets Summary
Appendix B.1. Network Level Datasets
Appendix B.2. Host Level Datasets
Appendix B.3. Multi-modal Datasets
Appendix C. Evaluation Metrics Summary
Appendix C.1. Confusion Matrix for Intrusion Detection

Appendix C.2. Performance Metrics
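In the definitions below, TP, TN, FP and FN denote the true positive, true negative, false positive and false negative counts of the confusion matrix (Appendix C.1), with attacks as the positive class.
1. FPR (also known as the false alarm rate, FAR) is the proportion of normal instances incorrectly classified as anomalies. $\mathrm{FPR} = \frac{FP}{FP + TN}$
2. FNR is the proportion of anomalies incorrectly classified as normal. $\mathrm{FNR} = \frac{FN}{FN + TP}$
3. TNR is the proportion of normal instances correctly identified. $\mathrm{TNR} = \frac{TN}{TN + FP}$
4. $\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
5. $\mathrm{Precision} = \frac{TP}{TP + FP}$
6. Recall (TPR, Sensitivity or Detection Rate) $= \frac{TP}{TP + FN}$
7. F1-score $= \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$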
-
where is the number of total attack blocks, is the number of attack blocks detected by IDS, is the total number of normal blocks and is the total number of normal blocks detected by IDS [99].
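The confusion-matrix metrics listed above can be sketched in Python. This is a minimal illustration using the standard textbook definitions, assuming attacks are the positive class; the function name `binary_metrics` and the example counts are hypothetical and are not taken from any of the reviewed papers.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common IDS metrics from confusion-matrix counts.

    tp/fn count attacks that were detected/missed;
    tn/fp count normal instances that were passed/flagged.
    """
    def ratio(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "fpr": ratio(fp, fp + tn),            # false alarm rate
        "fnr": ratio(fn, fn + tp),            # missed-attack rate
        "tnr": ratio(tn, tn + fp),            # specificity
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,                     # TPR / detection rate
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts: 80 attacks detected, 20 missed,
# 10 false alarms and 890 normal instances correctly passed.
m = binary_metrics(tp=80, fp=10, tn=890, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With these example counts the accuracy is 0.97 while the FNR is 0.20, which shows why accuracy alone can hide missed attacks when normal traffic dominates.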
1. Macro F1-score $= \frac{1}{N}\sum_{i=1}^{N} \mathrm{F1}_i$, where $N$ is the number of classes and $\mathrm{F1}_i$ is the F1-score for class $i$.
2. Weighted F1-score $= \sum_{i=1}^{N} \frac{n_i}{n}\,\mathrm{F1}_i$, where $N$ is the number of classes, $n_i$ is the number of true instances in class $i$, $n$ is the total number of instances and $\mathrm{F1}_i$ is the F1-score for class $i$.
3. Balanced Accuracy $= \frac{\mathrm{TPR} + \mathrm{TNR}}{2}$, which is the average of the recall on the positive and negative classes.
4. Matthews correlation coefficient (MCC) $= \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$, where TP, TN, FP and FN are the confusion-matrix counts. MCC is used to evaluate binary classification in imbalanced scenarios and ranges between -1 and +1, where +1 indicates perfect classification, 0 indicates classification no better than random, and -1 indicates complete misclassification.
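The imbalance-aware metrics above can be sketched in the same way. This is a minimal illustration under stated assumptions: the class set is taken from the ground-truth labels, the per-class F1 uses the equivalent form 2TP / (2TP + FP + FN), and all function names and example labels are hypothetical.

```python
import math


def per_class_f1(y_true: list, y_pred: list, label) -> float:
    """One-vs-rest F1 for a single class, as 2TP / (2TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    den = 2 * tp + fp + fn
    return 2 * tp / den if den else 0.0


def macro_f1(y_true: list, y_pred: list) -> float:
    """Unweighted mean of per-class F1 (every class counts equally)."""
    labels = sorted(set(y_true))
    return sum(per_class_f1(y_true, y_pred, c) for c in labels) / len(labels)


def weighted_f1(y_true: list, y_pred: list) -> float:
    """Per-class F1 weighted by the share n_i / n of true instances."""
    n = len(y_true)
    labels = sorted(set(y_true))
    return sum(y_true.count(c) / n * per_class_f1(y_true, y_pred, c)
               for c in labels)


def balanced_accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Mean of TPR and TNR for a binary confusion matrix."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return (tpr + tnr) / 2


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient; 0.0 if any marginal is empty."""
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / den if den else 0.0


# Hypothetical multi-class example: 6 normal, 3 DoS, 1 probe instance.
y_true = ["normal"] * 6 + ["dos"] * 3 + ["probe"]
y_pred = ["normal"] * 5 + ["dos", "dos", "dos", "normal", "probe"]
print(round(macro_f1(y_true, y_pred), 4))     # 0.8333
print(round(weighted_f1(y_true, y_pred), 4))  # 0.8

# Hypothetical binary counts with a 1:9 attack-to-normal ratio.
print(round(balanced_accuracy(tp=80, fp=10, tn=890, fn=20), 4))
print(round(mcc(tp=80, fp=10, tn=890, fn=20), 4))
```

In the multi-class example the macro score (about 0.83) is higher than the weighted score (0.80) because the single, perfectly classified probe instance counts as much as the majority class in the macro average.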
Appendix D. Reviewed Papers Summary
| Proposed System | Data Source | Graph Used | Phase 1: Graph Pre-processing, Feature Extraction and GRL | Phase 2: Anomaly Detection | Post Analysis |
|---|---|---|---|---|---|
| Xiao et al. [8] | Network | Network traffic graph | Node embedding | Classification | ✗ |
| GODIT [57] | Network | Network traffic graph | Feature extraction, Encoding | Outlier detection | ✗ |
| Munoz et al. [58] | Network | Network traffic graph | Feature extraction | Deep learning | ✗ |
| Fu et al. [55] | Network | Network traffic graph | Structural and temporal embedding | Ensemble method | ✗ |
| Gao et al. [67] | Network | Network traffic graph | Node embedding | Clustering | ✗ |
| Hu et al. [60] | Network | Network flow graph | Graph / subgraph embedding | Ensemble method | ✗ |
| Messai et al. [62] | Network | Network flow graph | Node embedding | Deep learning | ✗ |
| Friji et al. [74] | Network | Network flow graph | Data transformation, Spatial and non-spatial embeddings | Neural network | ✗ |
| DLGNN [79] | Network | Network flow graph | Structural (edge) and temporal embedding | Deep learning | ✗ |
| Anomal-E [76] | Network | Network flow graph | Edge embedding | Classification / Clustering / Outlier detection | ✗ |
| X-CBA [77] | Network | Network flow graph | Edge embedding | ML classification | ME |
| Meng et al. [51] | Network | Network logs graph | Data reduction, Feature extraction, Encoding | Outlier detection | ✗ |
| Sec2graph [19] | Network | Network logs graph | Encoding | Deep learning | ✗ |
| GenGLAD [81] | Network | Network log graph | Feature extraction, Encoding | Clustering | ✗ |
| Wrongdoing Monitor [109] | Network | Property graph | Node embedding | Classification | ✗ |
| PROVDETECTOR [7] | Host | Provenance graph | Feature extraction, Graph embedding | Clustering | ✗ |
| UNICORN [49] | Host | Provenance graph | Data transformation, Encoding | Clustering | ✗ |
| ANUBIS [46] | Host | Provenance graph | Feature extraction, Encoding | Deep learning | IR |
| PROGRAPHER [52] | Host | Provenance graph | Graph embedding | Deep learning | IR |
| EdgeTorrent [92] | Host | Provenance graph | Encoding, Node embedding | Deep learning | ✗ |
| OC-DHetGNN [103] | Host | Provenance graph | Node and graph embedding | Deep learning | ✗ |
| GCA [86] | Host | Provenance graph | Graph / subgraph embedding | Deep learning | ✗ |
| Lakha et al. [88] | Host | Provenance graph | Encoded features, Node embedding | Classification | ✗ |
| R-CAID [93] | Host | Provenance graph | Node embedding | Clustering | ✗ |
| Flash [89] | Host | Provenance graph | Encoding, Node embedding | Classification | IR |
| MAGIC [90] | Host | Provenance graph | Node embedding | Outlier detection | ✗ |
| Cao et al. [95] | Host | Audit data graph | Feature extraction | Classification | ✗ |
| PG-AID [104] | Host | Provenance graph | Path embedding | Deep learning | ✗ |
| CTLMD [96] | Host | Audit data graph | Structural (node) and temporal embedding | Classification | ✗ |
| Federated GNN [98] | CAN | CAN graph | Feature generation | Autoencoder | ✗ |
| GB-IDS [106] | CAN | CAN graph | Feature generation | Graph Encoder Decoder | ✗ |
| Islam et al. [100] | CAN | CAN graph | Feature extraction | Statistical analysis | ✗ |

| Proposed System | Data Source | Graph Used | End-to-end Anomaly Detection Approach | Post Analysis |
|---|---|---|---|---|
| TPE-NIDS [65] | Network | Network traffic graph | GNN based on edge embedding | ✗ |
| FAPDD [69] | Network | Network traffic graph | Graph kernel based | ✗ |
| Network AD [66] | Network | Network traffic graph | Graph analysis with time series analysis | ✗ |
| GTAE-IDS [70] | Network | Network traffic graph | Graph autoencoder | ✗ |
| GRID [71] | Network | Network traffic graph | GNN with attention mechanism | ✗ |
| Villegas et al. [68] | Network | Network traffic graph | GNN | ✗ |
| AnomGraphAdv [72] | Network | Network traffic graph | Graph autoencoder | ✗ |
| E-GraphSAGE [75] | Network | Network flow graph | GNN based on edge embedding | ✗ |
| HyperVision [73] | Network | Network flow graph | Graph clustering | ✗ |
| FeCoGraph [78] | Network | Network flow graph | GNN with contrastive learning | ✗ |
| KnowGraph [82] | Network | Network flow graph | Knowledge-enabled GNN | ✗ |
| RShield [84] | Network | Network logs graph | GNN based on node embedding | ✗ |
| Kisanga et al. [83] | Network | Network logs graph | GNN | ✗ |
| MIMC [85] | Network | Network logs graph | Graph clustering | ✗ |
| Wei et al. [12] | Network | Directed graph | GNN with structure learning | IAR |
| AttackMiner [91] | Host | Provenance graph | GNN based on attention to neighbor nodes | ✗ |
| LogTracer [87] | Host | Provenance graph | Anomaly scoring based | ✗ |
| AJSAGE [94] | Host | Provenance graph | GNN | ✗ |
| MEWRGNN [97] | Host | Audit data graph | GNN | ✗ |
| Logs2Graphs [37] | Host | Host logs graph | GNN based on nodes and edge feature learning | IAR |
| KAIROS [59] | Host | Provenance graph | GNN with a temporal model | IR |
| WSG-InV [99] | CAN | CAN graph | Subgraph analysis | ✗ |
| BERTHTLG [108] | Multi-modal | Multi-modal data graph | GNN based on anomaly scoring | ✗ |
| Li et al. [80] | Multi-modal | Network logs graph, Provenance graph | Novel attention based GNN approach | ✗ |
| MSTGAD [101] | Multi-modal | Multi-modal data graph | Graph encoder-decoder approach | ✗ |
| TraceGra [107] | Multi-modal | Multi-modal data graph | GNN with a temporal model | ✗ |
| StrGNN [63] | N/A | Dynamic graph | GNN with a temporal model | ✗ |
| TSAD [64] | N/A | Dynamic graph | Community and evolutionary based | ✗ |
References
1. startupdaily. The DP World logistics cyber attack looks like sabotage by a ‘foreign state actor’. 2019.
2. Australian Research Council. DP240101547, Griffith University. 2019.
3. Zuech, R.; Khoshgoftaar, T.M.; Wald, R. Intrusion detection and big heterogeneous data: a survey. Journal of Big Data 2015, 2, 1–41.
4. Reports and Data. Anomaly Detection Market. 2022.
5. Azeez, N.A.; Bada, T.M.; Misra, S.; Adewumi, A.; Van der Vyver, C.; Ahuja, R. Intrusion detection and prevention systems: an updated review. Data Management, Analytics and Innovation: Proceedings of ICDMAI 2019, Volume 1 2020, pp. 685–696.
6. Samrin, R.; Vasumathi, D. Review on anomaly based network intrusion detection system. In Proceedings of the 2017 International Conference on Electrical, Electronics, Communication, Computer, and Optimization Techniques (ICEECCOT); IEEE, 2017; pp. 141–147.
7. Wang, Q.; Hassan, W.U.; Li, D.; Jee, K.; Yu, X.; Zou, K.; Rhee, J.; Chen, Z.; Cheng, W.; Gunter, C.A.; et al. You Are What You Do: Hunting Stealthy Malware via Data Provenance Analysis. In Proceedings of the NDSS, 2020.
8. Xiao, Q.; Liu, J.; Wang, Q.; Jiang, Z.; Wang, X.; Yao, Y. Towards network anomaly detection using graph embedding. In Proceedings of the Computational Science–ICCS 2020: 20th International Conference, Amsterdam, The Netherlands, June 3–5, 2020, Proceedings, Part IV 20; Springer, 2020; pp. 156–169.
9. Garcia-Teodoro, P.; Diaz-Verdejo, J.; Maciá-Fernández, G.; Vázquez, E. Anomaly-based network intrusion detection: Techniques, systems and challenges. Computers & Security 2009, 28, 18–28.
10. Lee, J.; Park, K. GAN-based imbalanced data intrusion detection system. Personal and Ubiquitous Computing 2021, 25, 121–128.
11. Zeng, J.; Wang, X.; Liu, J.; Chen, Y.; Liang, Z.; Chua, T.S.; Chua, Z.L. Shadewatcher: Recommendation-guided cyber threat analysis using system audit records. In Proceedings of the 2022 IEEE Symposium on Security and Privacy (SP); IEEE, 2022; pp. 489–506.
12. Wei, C.; Xie, G.; Diao, Z. Network Flow Based IoT Anomaly Detection Using Graph Neural Network. In Proceedings of the International Conference on Knowledge Science, Engineering and Management; Springer, 2023; pp. 432–445.
13. Milajerdi, S.M.; Gjomemo, R.; Eshete, B.; Sekar, R.; Venkatakrishnan, V. Holmes: real-time APT detection through correlation of suspicious information flows. In Proceedings of the 2019 IEEE Symposium on Security and Privacy (SP); IEEE, 2019; pp. 1137–1152.
14. Kim, J. Studies on Inspecting Encrypted Data: Trends and Challenges. Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications (JoWUA) 2023, 189–199.
15. Ren, J.; Xia, F.; Lee, I.; Noori Hoshyar, A.; Aggarwal, C. Graph learning for anomaly analytics: Algorithms, applications, and challenges. ACM Transactions on Intelligent Systems and Technology 2023, 14, 1–29.
16. Paudel, R.; Tharp, L.; Kaiser, D.; Eberle, W.; Gannod, G. Visualization of Anomalies using Graph-Based Anomaly Detection. In Proceedings of the International FLAIRS Conference; 2021; Vol. 34.
17. Hong, Y.; Shi, C.; Chen, J.; Wang, H.; Wang, D. Multitask Asynchronous Metalearning for Few-Shot Anomalous Node Detection in Dynamic Networks. IEEE Transactions on Computational Social Systems 2024.
18. Akoglu, L.; Tong, H.; Koutra, D. Graph-based Anomaly Detection and Description: A Survey. 2014.
19. Leichtnam, L.; Totel, E.; Prigent, N.; Mé, L. Sec2graph: Network attack detection based on novelty detection on graph structured data. In Proceedings of the Detection of Intrusions and Malware, and Vulnerability Assessment: 17th International Conference, DIMVA 2020, Lisbon, Portugal, June 24–26, 2020, Proceedings 17; Springer, 2020; pp. 238–258.
20. Guo, D.; Liu, Z.; Li, R. RegraphGAN: A graph generative adversarial network model for dynamic network anomaly detection. Neural Networks 2023, 166, 273–285.
21. Pourhabibi, T.; Ong, K.L.; Kam, B.H.; Boo, Y.L. Fraud detection: A systematic literature review of graph-based anomaly detection approaches. Decision Support Systems 2020, 133, 113303.
22. Li, Z.; Chen, Q.A.; Yang, R.; Chen, Y.; Ruan, W. Threat detection and investigation with system-level provenance graphs: a survey. Computers & Security 2021, 106, 102282.
23. Kim, H.; Lee, B.S.; Shin, W.Y.; Lim, S. Graph anomaly detection with graph neural networks: Current status and challenges. IEEE Access 2022.
24. Pazho, A.D.; Noghre, G.A.; Purkayastha, A.A.; Vempati, J.; Martin, O.; Tabkhi, H. A Survey of Graph-based Deep Learning for Anomaly Detection in Distributed Systems. arXiv 2022, arXiv:2206.04149.
25. Lagraa, S.; Husák, M.; Seba, H.; Vuppala, S.; State, R.; Ouedraogo, M. A review on graph-based approaches for network security monitoring and botnet detection. International Journal of Information Security 2023, 1–22.
26. Bilot, T.; El Madhoun, N.; Al Agha, K.; Zouaoui, A. Graph Neural Networks for Intrusion Detection: A Survey. IEEE Access 2023.
27. Zhong, M.; Lin, M.; Zhang, C.; Xu, Z. A Survey on Graph Neural Networks for Intrusion Detection Systems: Methods, Trends and Challenges. Computers & Security 2024, 103821.
28. Eltanbouly, S.; Bashendy, M.; AlNaimi, N.; Chkirbene, Z.; Erbad, A. Machine learning techniques for network anomaly detection: A survey. In Proceedings of the 2020 IEEE International Conference on Informatics, IoT, and Enabling Technologies (ICIoT); IEEE, 2020; pp. 156–162.
29. Kwon, D.; Kim, H.; Kim, J.; Suh, S.C.; Kim, I.; Kim, K.J. A survey of deep learning-based network anomaly detection. Cluster Computing 2019, 22, 949–961.
30. Moustafa, N.; Hu, J.; Slay, J. A holistic review of network anomaly detection systems: A comprehensive survey. Journal of Network and Computer Applications 2019, 128, 33–55.
31. Nassif, A.B.; Talib, M.A.; Nasir, Q.; Dakalbab, F.M. Machine Learning for Anomaly Detection: A Systematic Review. 2021.
32. Chalapathy, R.; Chawla, S. Deep Learning for Anomaly Detection: A Survey. 2019.
33. Patcha, A.; Park, J.M. An overview of anomaly detection techniques: Existing solutions and latest technological trends. Computer Networks 2007, 51, 3448–3470.
34. Genereux, S.J.; Lai, A.K.; Fowles, C.O.; Roberge, V.R.; Vigeant, G.P.; Paquet, J.R. MAIDENS: MIL-STD-1553 anomaly-based intrusion detection system using time-based histogram comparison. IEEE Transactions on Aerospace and Electronic Systems 2019, 56, 276–284.
35. Sensarma, D.; Sarma, S.S. A survey on different graph based anomaly detection techniques. Indian J Sci Technol 2015, 8, 1–7.
36. Lamichhane, P.B.; Eberle, W. Anomaly Detection in Graph Structured Data: A Survey. arXiv 2024, arXiv:2405.06172.
37. Li, Z.; Shi, J.; van Leeuwen, M. Graph Neural Network based Log Anomaly Detection and Explanation. arXiv 2023, arXiv:2307.00527.
38. Fernandes, G.; Rodrigues, J.J.; Carvalho, L.F.; Al-Muhtadi, J.F.; Proença, M.L. A comprehensive survey on network anomaly detection. 2019.
39. Hajj, S.; El Sibai, R.; Bou Abdo, J.; Demerjian, J.; Makhoul, A.; Guyeux, C. Anomaly-based intrusion detection systems: The requirements, methods, measurements, and datasets. Transactions on Emerging Telecommunications Technologies 2021, 32, e4240.
40. Habeeb, R.A.A.; Nasaruddin, F.; Gani, A.; Hashem, I.A.T.; Ahmed, E.; Imran, M. Real-time big data processing for anomaly detection: A survey. International Journal of Information Management 2019, 45, 289–307.
41. Chandola, V.; Banerjee, A.; Kumar, V. Anomaly detection: A survey. ACM Computing Surveys (CSUR) 2009.
42. Ahmed, M.; Mahmood, A.N.; Hu, J. A survey of network anomaly detection techniques. Journal of Network and Computer Applications 2016, 60, 19–31.
43. Marnerides, A.K.; Schaeffer-Filho, A.; Mauthe, A. Traffic anomaly diagnosis in Internet backbone networks: A survey. Computer Networks 2014, 73, 224–243.
44. Adhikari, D.; Jiang, W.; Zhan, J.; Rawat, D.B.; Bhattarai, A. Recent advances in anomaly detection in Internet of Things: Status, challenges, and perspectives. Computer Science Review 2024, 54, 100665.
45. Anagnostopoulos, C. Weakly supervised learning: how to engineer labels for machine learning in cyber-security. In Data Science for Cyber-Security; World Scientific, 2019; pp. 195–226.
46. Anjum, M.M.; Iqbal, S.; Hamelin, B. ANUBIS: a provenance graph-based framework for advanced persistent threat detection. In Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing; 2022; pp. 1684–1693.
47. Pang, G.; Shen, C.; Cao, L.; Hengel, A.V.D. Deep learning for anomaly detection: A review. ACM Computing Surveys (CSUR) 2021, 54, 1–38.
48. Khraisat, A.; Gondal, I.; Vamplew, P.; Kamruzzaman, J. Survey of intrusion detection systems: techniques, datasets and challenges. Cybersecurity 2019, 2, 1–22.
49. Han, X.; Pasquier, T.; Bates, A.; Mickens, J.; Seltzer, M. Unicorn: Runtime provenance-based detector for advanced persistent threats. arXiv 2020, arXiv:2001.01525.
50. Lanvin, M.; Gimenez, P.F.; Han, Y.; Majorczyk, F.; Mé, L.; Totel, É. Detecting APT through graph anomaly detection. In Proceedings of the RESSI 2022-Rendez-Vous de la Recherche et de l’Enseignement de la Sécurité des Systèmes d’Information; 2022; pp. 1–3.
51. Meng, Q.; Wang, H.; Oo, N.; Lim, H.W.; Schätz, B.J.; Sikdar, B. Graph-Based Attack Path Discovery for Network Security.
52. Yang, F.; Xu, J.; Xiong, C.; Li, Z.; Zhang, K. PROGRAPHER: An Anomaly Detection System based on Provenance Graph Embedding. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23); 2023; pp. 4355–4372.
53. Jain, P.; Jain, S.; Zaïane, O.R.; Srivastava, A. Anomaly detection in resource constrained environments with streaming data. IEEE Transactions on Emerging Topics in Computational Intelligence 2021, 6, 649–659.
54. Ouarbya, L.; Rahul, M. Interpretable Anomaly Detection: A Hybrid Approach Using Rule-Based and Machine Learning Techniques. 2023.
55. Fu, Z.; Liu, M.; Qin, Y.; Zhang, J.; Zou, Y.; Yin, Q.; Li, Q.; Duan, H. Encrypted malware traffic detection via graph-based network analysis. In Proceedings of the 25th International Symposium on Research in Attacks, Intrusions and Defenses; 2022; pp. 495–509.
56. Xing, J.; Wu, C. Detecting anomalies in encrypted traffic via deep dictionary learning. In Proceedings of the IEEE INFOCOM 2020-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS); IEEE, 2020; pp. 734–739.
57. Paudel, R.; Muncy, T.; Eberle, W. Detecting DoS attack in smart home IoT devices using a graph-based approach. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data); IEEE, 2019; pp. 5249–5258.
58. Muñoz, D.C.; Valiente, A.d.C. A novel botnet attack detection for IoT networks based on communication graphs. Cybersecurity 2023, 6, 33.
59. Cheng, Z.; Lv, Q.; Liang, J.; Wang, Y.; Sun, D.; Pasquier, T.; Han, X. KAIROS: practical intrusion detection and investigation using whole-system provenance. In Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP); IEEE, 2024; pp. 3533–3551.
60. Hu, X.; Gao, W.; Cheng, G.; Li, R.; Zhou, Y.; Wu, H. Towards Early and Accurate Network Intrusion Detection Using Graph Embedding. IEEE Transactions on Information Forensics and Security 2023.
61. Yu, L.; Tao, J.; Xu, Y.; Sun, W.; Wang, Z. TLS fingerprint for encrypted malicious traffic detection with attributed graph kernel. Computer Networks 2024, 247, 110475.
62. Messai, M.L.; Seba, H. IoT Network Attack Detection: Leveraging Graph Learning for Enhanced Security. In Proceedings of the 18th International Conference on Availability, Reliability and Security; 2023; pp. 1–7.
63. Cai, L.; Chen, Z.; Luo, C.; Gui, J.; Ni, J.; Li, D.; Chen, H. Structural temporal graph neural networks for anomaly detection in dynamic graphs. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management; 2021; pp. 3747–3756.
64. Jiang, Y.; Liu, G. Two-stage anomaly detection algorithm via dynamic community evolution in temporal graph. Applied Intelligence 2022, 52, 12222–12240.
65. Zhang, Y.; Zhang, Y.; Wu, Y.; Li, C. TPE-NIDS: uses graph neural networks to detect malicious traffic. In Proceedings of the 2022 4th International Conference on Frontiers Technology of Information and Computer (ICFTIC); IEEE, 2022; pp. 949–958.
66. Tsikerdekis, M.; Waldron, S.; Emanuelson, A. Network anomaly detection using exponential random graph models and autoregressive moving average. IEEE Access 2021, 9, 134530–134542.
67. Gao, M.; Wu, L.; Li, Q.; Chen, W. Anomaly traffic detection in IoT security using graph neural networks. Journal of Information Security and Applications 2023, 76, 103532.
68. Villegas-Ch, W.; Govea, J.; Navarro, A.M.; Játiva, P.P. Intrusion Detection in IoT Networks Using Dynamic Graph Modeling and Graph-Based Neural Networks. IEEE Access 2025.
69. Liu, X.; Ren, J.; He, H.; Zhang, B.; Song, C.; Wang, Y. A fast all-packets-based DDoS attack detection approach based on network graph and graph kernel. Journal of Network and Computer Applications 2021, 185, 103079.
70. Ghadermazi, J.; Hore, S.; Shah, A.; Bastian, N.D. GTAE-IDS: Graph Transformer-based Autoencoder Framework for Real-time Network Intrusion Detection. IEEE Transactions on Information Forensics and Security 2025.
71. Kong, X.; Song, Z.; Ye, X.; Jiao, J.; Qi, H.; Liu, X. GRID: Graph-Based Robust Intrusion Detection Solution for Industrial IoT Networks. IEEE Internet of Things Journal 2025.
72. Bajpai, S.; Krishna Murthy, P.; Kumar, N. AnomGraphAdv: Enhancing Anomaly and Network Intrusion Detection in Wireless Networks Using Adversarial Training and Temporal Graph Networks. In Proceedings of the 17th ACM Conference on Security and Privacy in Wireless and Mobile Networks; 2024; pp. 113–122.
73. Fu, C.; Li, Q.; Xu, K. Detecting unknown encrypted malicious traffic in real time via flow interaction graph analysis. arXiv 2023, arXiv:2301.13686.
74. Friji, H.; Olivereau, A.; Sarkiss, M. Efficient Network Representation for GNN-Based Intrusion Detection. In Proceedings of the International Conference on Applied Cryptography and Network Security; Springer, 2023; pp. 532–554.
75. Lo, W.W.; Layeghy, S.; Sarhan, M.; Gallagher, M.; Portmann, M. E-GraphSAGE: A graph neural network based intrusion detection system for IoT. In Proceedings of the NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium; IEEE, 2022; pp. 1–9.
76. Caville, E.; Lo, W.W.; Layeghy, S.; Portmann, M. Anomal-E: A self-supervised network intrusion detection system based on graph neural networks. Knowledge-Based Systems 2022, 258, 110030.
77. Kaya, K.; Ak, E.; Bas, S.; Canberk, B.; Oguducu, S.G. X-CBA: Explainability Aided CatBoosted Anomal-E for Intrusion Detection System. arXiv 2024, arXiv:2402.00839.
78. Mao, Q.; Lin, X.; Xu, W.; Qi, Y.; Su, X.; Li, G.; Li, J. FeCoGraph: Label-aware Federated Graph Contrastive Learning for Few-shot Network Intrusion Detection. IEEE Transactions on Information Forensics and Security 2025.
79. Duan, G.; Lv, H.; Wang, H.; Feng, G. Application of a dynamic line graph neural network for intrusion detection with semisupervised learning. IEEE Transactions on Information Forensics and Security 2022, 18, 699–714.
80. Li, Z.; Cheng, X.; Sun, L.; Zhang, J.; Chen, B. A hierarchical approach for advanced persistent threat detection with attention-based graph neural networks. Security and Communication Networks 2021, 2021, 1–14.
81. Wang, H.; Chen, Y.; Zhang, C.; Li, J.; Gan, C.; Zhang, Y.; Chen, X. GenGLAD: A Generated Graph Based Log Anomaly Detection Framework. In Proceedings of the International Conference on Smart Computing and Communication; Springer, 2022; pp. 11–22.
82. Zhou, A.; Xu, X.; Raghunathan, R.; Lal, A.; Guan, X.; Yu, B.; Li, B. KnowGraph: Knowledge-Enabled Anomaly Detection via Logical Reasoning on Graph Data. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security; 2024; pp. 168–182.
83. Kisanga, P.; Woungang, I.; Traore, I.; Carvalho, G.H. Network Anomaly Detection Using a Graph Neural Network. In Proceedings of the 2023 International Conference on Computing, Networking and Communications (ICNC); IEEE, 2023; pp. 61–65.
84. Yang, W.; Gao, P.; Huang, H.; Wei, X.; Liu, W.; Zhu, S.; Luo, W. RShield: A refined shield for complex multi-step attack detection based on temporal graph network. In Proceedings of the International Conference on Database Systems for Advanced Applications; Springer, 2022; pp. 468–480.
85. Copstein, R.; Niblett, B.; Johnston, A.; Schwartzentruber, J.; Heywood, M.; Zincir-Heywood, N. Towards Anomaly Detection using Multiple Instances of Micro-Cluster Detection. In Proceedings of the 2023 7th Cyber Security in Networking Conference (CSNet); IEEE, 2023; pp. 185–191.
86. Ye, M.; Men, S.; Xie, L.; Chen, B. Detect Advanced Persistent Threat In Graph-Level Using Competitive AutoEncoder. In Proceedings of the 2023 2nd International Conference on Networks, Communications and Information Technology; 2023; pp. 28–34.
87. Niu, W.; Yu, Z.; Li, Z.; Li, B.; Zhang, R.; Zhang, X. LogTracer: Efficient Anomaly Tracing Combining System Log Detection and Provenance Graph. In Proceedings of the GLOBECOM 2022-2022 IEEE Global Communications Conference; IEEE, 2022; pp. 3356–3361.
88. Lakha, B.; Mount, S.L.; Serra, E.; Cuzzocrea, A. Anomaly Detection in Cybersecurity Events Through Graph Neural Network and Transformer Based Model: A Case Study with BETH Dataset. In Proceedings of the 2022 IEEE International Conference on Big Data (Big Data); IEEE, 2022; pp. 5756–5764.
89. Rehman, M.U.; Ahmadi, H.; Hassan, W.U. Flash: A comprehensive approach to intrusion detection via provenance graph representation learning. In Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP); IEEE, 2024; pp. 3552–3570.
90. Jia, Z.; Xiong, Y.; Nan, Y.; Zhang, Y.; Zhao, J.; Wen, M. MAGIC: Detecting advanced persistent threats via masked graph representation learning. In Proceedings of the 33rd USENIX Security Symposium (USENIX Security 24); 2024; pp. 5197–5214.
- Pan, Y.; Cai, L.; Leng, T.; Zhao, L.; Ma, J.; Yu, A.; Meng, D. AttackMiner: A Graph Neural Network Based Approach for Attack Detection from Audit Logs. In Proceedings of the International Conference on Security and Privacy in Communication Systems; Springer, 2022; pp. 510–528. [Google Scholar]
- King, I.J.; Shu, X.; Jang, J.; Eykholt, K.; Lee, T.; Huang, H.H. EdgeTorrent: Real-time Temporal Graph Representations for Intrusion Detection. In Proceedings of the Proceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses; 2023; pp. 77–91. [Google Scholar]
- Goyal, A.; Wang, G.; Bates, A. R-caid: Embedding root cause analysis within provenance-based intrusion detection. In Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP); IEEE, 2024; pp. 3515–3532. [Google Scholar]
- Xu, L.; Zhao, Z.; Zhao, D.; Li, X.; Lu, X.; Yan, D. AJSAGE: A intrusion detection scheme based on Jump-Knowledge Connection To GraphSAGE. Computers & Security 2025, 150, 104263. [Google Scholar]
- Cao, Z.; Stephen Huang, S.H. Host-Based Intrusion Detection: A Behavioral Approach Using Graph Model. In Proceedings of the International Conference on Hybrid Intelligent Systems; Springer, 2022; pp. 1337–1346. [Google Scholar]
- Zhao, S.; Wei, R.; Cai, L.; Yu, A.; Meng, D. Ctlmd: Continuous-temporal lateral movement detection using graph embedding. In Proceedings of the Information and Communications Security: 21st International Conference, ICICS 2019, Beijing, China, December 15–17, 2019, Revised Selected Papers 21. Springer, 2020; pp. 181–196. [Google Scholar]
- Xiao, J.; Yang, L.; Zhong, F.; Wang, X.; Chen, H.; Li, D. Robust Anomaly-based Insider Threat Detection using Graph Neural Network. IEEE Transactions on Network and Service Management 2022. [Google Scholar] [CrossRef]
- Zhang, H.; Zeng, K.; Lin, S. Federated graph neural network for fast anomaly detection in controller area networks. IEEE Transactions on Information Forensics and Security 2023, 18, 1566–1579. [Google Scholar] [CrossRef]
- Linghu, Y.; Li, X. Wsg-inv: Weighted state graph model for intrusion detection on in-vehicle network. In Proceedings of the 2021 IEEE Wireless Communications and Networking Conference (WCNC); IEEE, 2021; pp. 1–7. [Google Scholar]
- Islam, R.; Refat, R.U.D.; Yerram, S.M.; Malik, H. Graph-based intrusion detection system for controller area networks. IEEE Transactions on Intelligent Transportation Systems 2020, 23, 1727–1736. [Google Scholar] [CrossRef]
- Huang, J.; Yang, Y.; Yu, H.; Li, J.; Zheng, X. Twin Graph-Based Anomaly Detection via Attentive Multi-Modal Learning for Microservice System. In Proceedings of the 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE); IEEE, 2023; pp. 66–78. [Google Scholar]
- Liu, T.; Li, Z.; Long, H.; Bilal, A. Nt-gnn: Network traffic graph for 5g mobile iot android malware detection. Electronics 2023, 12, 789. [Google Scholar] [CrossRef]
- Huang, Z.; Gu, Y.; Zhao, Q. One-Class Directed Heterogeneous Graph Neural Network for Intrusion Detection. In Proceedings of the 2022 the 6th International Conference on Innovation in Artificial Intelligence (ICIAI); 2022; pp. 178–184. [Google Scholar]
- Meng, L.; Xi, R.; Li, Z.; Zhu, H. PG-AID: An Anomaly-based Intrusion Detection Method Using Provenance Graph. In Proceedings of the 2024 27th International Conference on Computer Supported Cooperative Work in Design (CSCWD); IEEE, 2024; pp. 2522–2527. [Google Scholar]
- Pennington, J.; Socher, R.; Manning, C.D. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP); 2014; pp. 1532–1543. [Google Scholar]
- Meng, Y.; Li, J.; Liu, F.; Li, S.; Hu, H.; Zhu, H. GB-IDS: An Intrusion Detection System for CAN Bus Based on Graph Analysis. In Proceedings of the 2023 IEEE/CIC International Conference on Communications in China (ICCC); IEEE, 2023; pp. 1–6. [Google Scholar]
- Chen, J.; Liu, F.; Jiang, J.; Zhong, G.; Xu, D.; Tan, Z.; Shi, S. TraceGra: A trace-based anomaly detection for microservice using graph deep learning. Computer Communications 2023, 204, 109–117. [Google Scholar] [CrossRef]
- Chen, L.; Dang, Q.; Chen, M.; Sun, B.; Du, C.; Lu, Z. BertHTLG: Graph-Based Microservice Anomaly Detection Through Sentence-Bert Enhancement. In Proceedings of the International Conference on Web Information Systems and Applications; Springer, 2023; pp. 427–439. [Google Scholar]
- Wang, C.; Zhu, H. Wrongdoing monitor: a graph-based behavioral anomaly detection in cyber Security. IEEE Transactions on Information Forensics and Security 2022, 17, 2703–2718. [Google Scholar] [CrossRef]
- Deng, D. DBSCAN clustering algorithm based on density. In Proceedings of the 2020 7th international forum on electrical engineering and automation (IFEEA); IEEE, 2020; pp. 949–953. [Google Scholar]
- Paudel, R.; Eberle, W. Snapsketch: Graph representation approach for intrusion detection in a streaming graph. In Proceedings of the 16th International Workshop on Mining and Learning with Graphs (MLG); 2020. [Google Scholar]
- Shi, H.; Li, H.; Zhang, D.; Cheng, C.; Cao, X. An efficient feature generation approach based on deep learning and feature selection techniques for traffic classification. Computer Networks 2018, 132, 81–98. [Google Scholar] [CrossRef]
- Eswaran, D.; Faloutsos, C.; Guha, S.; Mishra, N. Spotlight: Detecting anomalies in streaming graphs. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; 2018; pp. 1378–1386. [Google Scholar]
- Lamichhane, P.B.; Eberle, W. Anomaly detection in edge streams using term frequency-inverse graph frequency (tf-igf) concept. In Proceedings of the 2021 IEEE international conference on big data (Big Data); IEEE, 2021; pp. 661–667. [Google Scholar]
- Cai, H.; Zheng, V.W.; Chang, K.C.C. A comprehensive survey of graph embedding: Problems, techniques, and applications. IEEE Transactions on Knowledge and Data Engineering 2018, 30, 1616–1637. [Google Scholar] [CrossRef]
- Zhou, J.; Cui, G.; Hu, S.; Zhang, Z.; Yang, C.; Liu, Z.; Wang, L.; Li, C.; Sun, M. Graph neural networks: A review of methods and applications. AI Open 2020, 1, 57–81. [Google Scholar] [CrossRef]
- Zaccagnino, R.; Cirillo, A.; Guarino, A.; Lettieri, N.; Malandrino, D.; Zaccagnino, G. Towards a Geometric Deep Learning-Based Cyber Security: Network System Intrusion Detection Using Graph Neural Networks. In Proceedings of the SECRYPT; 2023; pp. 394–401. [Google Scholar]
- Gao, M.; Chen, L.; He, X.; Zhou, A. Bine: Bipartite network embedding. In Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval; 2018; pp. 715–724. [Google Scholar]
- Khemani, B.; Patil, S.; Kotecha, K.; Tanwar, S. A review of graph neural networks: concepts, architectures, techniques, challenges, datasets, applications, and future directions. Journal of Big Data 2024, 11, 18. [Google Scholar] [CrossRef]
- Gilmer, J.; Schoenholz, S.S.; Riley, P.F.; Vinyals, O.; Dahl, G.E. Neural message passing for quantum chemistry. In Proceedings of the International conference on machine learning; PMLR, 2017; pp. 1263–1272. [Google Scholar]
- Mohan, A.; Pramod, K. Temporal network embedding using graph attention network. Complex & Intelligent Systems 2022, 8, 13–27. [Google Scholar]
- Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. Advances in neural information processing systems 2017, 30. [Google Scholar]
- Feng, M.H.; Hsu, C.C.; Li, C.T.; Yeh, M.Y.; Lin, S.D. Marine: Multi-relational network embeddings with relational proximity and node attributes. In Proceedings of The World Wide Web Conference; 2019; pp. 470–479. [Google Scholar]
- Deng, A.; Hooi, B. Graph neural network-based anomaly detection in multivariate time series. Proceedings of the AAAI Conference on Artificial Intelligence 2021, 35, 4027–4035. [Google Scholar] [CrossRef]
- Purnama, S.R.; Istiyanto, J.E.; Amrizal, M.A.; Handika, V.; Rochman, S.; Dharmawan, A. Inductive Graph Neural Network with Causal Sampling for IoT Network Intrusion Detection System. In Proceedings of the 2022 International Conference on Computer Engineering, Network, and Intelligent Multimedia (CENIM); IEEE, 2022; pp. 241–246. [Google Scholar]
- Narayanan, A.; Chandramohan, M.; Venkatesan, R.; Chen, L.; Liu, Y.; Jaiswal, S. graph2vec: Learning distributed representations of graphs. arXiv 2017, arXiv:1707.05005. [Google Scholar] [CrossRef]
- Sun, F.Y.; Hoffmann, J.; Verma, V.; Tang, J. Infograph: Unsupervised and semi-supervised graph-level representation learning via mutual information maximization. arXiv 2019, arXiv:1908.01000. [Google Scholar]
- Zohrevand, Z.; Glässer, U. Should i raise the red flag? A comprehensive survey of anomaly scoring methods toward mitigating false alarms. arXiv 2019, arXiv:1904.06646. [Google Scholar]
- Eswaran, D.; Faloutsos, C. Sedanspot: Detecting anomalies in edge streams. In Proceedings of the 2018 IEEE International conference on data mining (ICDM); IEEE, 2018; pp. 953–958. [Google Scholar]
- Bu, Z.; Ye, S. Comparison of Different Partition Clustering Algorithms under the Network Flow Detection Scenario. In Proceedings of the 2024 8th International Conference on Communication and Information Systems (ICCIS); IEEE, 2024; pp. 140–144. [Google Scholar]
- Bhattarai, B.; Huang, H.H. Prov2vec: Learning provenance graph representation for anomaly detection in computer systems. In Proceedings of the 19th International Conference on Availability, Reliability and Security; 2024; pp. 1–14. [Google Scholar]
- Ersoy, P. Evolution of Outlier Algorithms for Anomaly Detection. Manchester Journal of Artificial Intelligence and Applied Sciences 2021, 2. [Google Scholar]
- Gogoi, P.; Bhattacharyya, D.K.; Borah, B.; Kalita, J.K. A survey of outlier detection methods in network anomaly identification. The Computer Journal 2011, 54, 570–588. [Google Scholar] [CrossRef]
- Ma, X.; Wu, J.; Xue, S.; Yang, J.; Zhou, C.; Sheng, Q.Z.; Xiong, H.; Akoglu, L. A comprehensive survey on graph anomaly detection with deep learning. IEEE Transactions on Knowledge and Data Engineering 2021. [Google Scholar] [CrossRef]
- Chen, Z.; Yeo, C.K.; Lee, B.S.; Lau, C.T. Autoencoder-based network anomaly detection. In Proceedings of the 2018 Wireless telecommunications symposium (WTS); IEEE, 2018; pp. 1–5. [Google Scholar]
- Samariya, D.; Thakkar, A. A comprehensive survey of anomaly detection algorithms. Annals of Data Science 2023, 10, 829–850. [Google Scholar] [CrossRef]
- Thomas, J.M.; Moallemy-Oureh, A.; Beddar-Wiesing, S.; Holzhüter, C. Graph neural networks designed for different graph types: A survey. arXiv 2022, arXiv:2204.03080. [Google Scholar]
- Ruff, L.; Vandermeulen, R.; Goernitz, N.; Deecke, L.; Siddiqui, S.A.; Binder, A.; Müller, E.; Kloft, M. Deep one-class classification. In Proceedings of the International conference on machine learning; PMLR, 2018; pp. 4393–4402. [Google Scholar]
- Xu, K.; Li, C.; Tian, Y.; Sonobe, T.; Kawarabayashi, K.i.; Jegelka, S. Representation learning on graphs with jumping knowledge networks. In Proceedings of the International conference on machine learning; PMLR, 2018; pp. 5453–5462. [Google Scholar]
- Bui, K.H.N.; Cho, J.; Yi, H. Spatial-temporal graph neural network for traffic forecasting: An overview and open research issues. Applied Intelligence 2022, 52, 2763–2774. [Google Scholar] [CrossRef]
- Rossi, E.; Chamberlain, B.; Frasca, F.; Eynard, D.; Monti, F.; Bronstein, M. Temporal graph networks for deep learning on dynamic graphs. arXiv 2020, arXiv:2006.10637. [Google Scholar] [CrossRef]
- Bhatia, S.; Liu, R.; Hooi, B.; Yoon, M.; Shin, K.; Faloutsos, C. Real-time anomaly detection in edge streams. ACM Transactions on Knowledge Discovery from Data (TKDD) 2022, 16, 1–22. [Google Scholar] [CrossRef]
- Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A.; et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP 2018, 1, 108–116. [Google Scholar]
- Moustafa, N.; Slay, J. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 military communications and information systems conference (MilCIS); IEEE, 2015; pp. 1–6. [Google Scholar]
- Sahoo, K.S.; Puthal, D.; Tiwary, M.; Rodrigues, J.J.; Sahoo, B.; Dash, R. An early detection of low rate DDoS attack to SDN based data center networks using information distance metrics. Future Generation Computer Systems 2018, 89, 685–697. [Google Scholar] [CrossRef]
- Garcia, S.; Grill, M.; Stiborek, J.; Zunino, A. An empirical comparison of botnet detection methods. Computers & Security 2014, 45, 100–123. [Google Scholar]
- Song, J.; Takakura, H.; Okabe, Y.; Eto, M.; Inoue, D.; Nakao, K. Statistical analysis of honeypot data and building of Kyoto 2006+ dataset for NIDS evaluation. In Proceedings of the First Workshop on Building Analysis Datasets and Gathering Experience Returns for Security; 2011; pp. 29–36. [Google Scholar]
- Lashkari, A.H.; Gil, G.D.; Mamun, M.S.I.; Ghorbani, A.A. Characterization of tor traffic using time based features. In Proceedings of the International Conference on Information Systems Security and Privacy; SciTePress, 2017; Vol. 2, pp. 253–262. [Google Scholar]
- Lukaseder, T. 2017-SUEE-data-set, 2017.
- Taheri, L.; Kadir, A.F.A.; Lashkari, A.H. Extensible android malware detection and family classification using network-flows and API-calls. In Proceedings of the 2019 international Carnahan conference on security technology (ICCST); IEEE, 2019; pp. 1–8. [Google Scholar]
- Kolias, C.; Kambourakis, G.; Stavrou, A.; Gritzalis, S. Intrusion detection in 802.11 networks: Empirical evaluation of threats and a public dataset. IEEE Communications Surveys & Tutorials 2015, 18, 184–208. [Google Scholar] [CrossRef]
- Glasser, J.; Lindauer, B. Bridging the gap: A pragmatic approach to generating insider threat data. In Proceedings of the 2013 IEEE Security and Privacy Workshops; IEEE, 2013; pp. 98–104. [Google Scholar]
- Koroniotis, N.; Moustafa, N.; Sitnikova, E.; Turnbull, B. Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-iot dataset. Future Generation Computer Systems 2019, 100, 779–796. [Google Scholar] [CrossRef]
- Sarhan, M.; Layeghy, S.; Moustafa, N.; Portmann, M. Netflow datasets for machine learning-based network intrusion detection systems. In Proceedings of the Big Data Technologies and Applications: 10th EAI International Conference, BDTA 2020, and 13th EAI International Conference on Wireless Internet, WiCON 2020, Virtual Event, December 11, 2020, Proceedings 10; Springer, 2021; pp. 117–135. [Google Scholar]
- Saad, S.; Traore, I.; Ghorbani, A.; Sayed, B.; Zhao, D.; Lu, W.; Felix, J.; Hakimian, P. Detecting P2P botnets through network behavior analysis and machine learning. In Proceedings of the 2011 Ninth annual international conference on privacy, security and trust; IEEE, 2011; pp. 174–180. [Google Scholar]
- Nack, E.A.; McKenzie, M.C.; Bastian, N.D. ACI-IoT-2023: A Robust Dataset for Internet of Things Network Security Analysis. In Proceedings of the MILCOM 2024-2024 IEEE Military Communications Conference (MILCOM); IEEE, 2024; pp. 1–6. [Google Scholar]
- Neto, E.C.P.; Dadkhah, S.; Ferreira, R.; Zohourian, A.; Lu, R.; Ghorbani, A.A. CICIoT2023: A real-time dataset and benchmark for large-scale attacks in IoT environment. Sensors 2023, 23, 5941. [Google Scholar] [CrossRef]
- Song, H.M.; Kim, H.K. Discovering can specification using on-board diagnostics. IEEE Design & Test 2020, 38, 93–103. [Google Scholar] [CrossRef]
- Lee, H.; Jeong, S.H.; Kim, H.K. OTIDS: A novel intrusion detection system for in-vehicle network by using remote frame. In Proceedings of the 2017 15th Annual Conference on Privacy, Security and Trust (PST); IEEE, 2017; pp. 57–5709. [Google Scholar]
- Camina, J.B.; Hernández-Gracidas, C.; Monroy, R.; Trejo, L. The Windows-Users and-Intruder simulations Logs dataset (WUIL): An experimental framework for masquerade detection mechanisms. Expert Systems with Applications 2014, 41, 919–930. [Google Scholar] [CrossRef]
- Alsaheel, A.; Nan, Y.; Ma, S.; Yu, L.; Walkup, G.; Celik, Z.B.; Zhang, X.; Xu, D. ATLAS: A sequence-based learning approach for attack investigation. In Proceedings of the 30th USENIX Security Symposium (USENIX Security 21); 2021; pp. 3005–3022. [Google Scholar]
- Manzoor, E.; Milajerdi, S.M.; Akoglu, L. Fast memory-efficient anomaly detection in streaming heterogeneous graphs. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016; pp. 1035–1044. [Google Scholar]
- Turcotte, M.J.; Kent, A.D.; Hash, C. Unified host and network data set. In Data science for cyber-security; World Scientific, 2019; pp. 1–22. [Google Scholar]
- Zhang, C.; Peng, X.; Sha, C.; Zhang, K.; Fu, Z.; Wu, X.; Lin, Q.; Zhang, D. Deeptralog: Trace-log combined microservice anomaly detection through graph-based deep learning. In Proceedings of the 44th International Conference on Software Engineering; 2022; pp. 623–634. [Google Scholar]
- Highnam, K.; Arulkumaran, K.; Hanif, Z.; Jennings, N.R. Beth dataset: Real cybersecurity data for unsupervised anomaly detection research. In Proceedings of the CEUR Workshop Proc; 2021; Vol. 3095, pp. 1–12. [Google Scholar]
- Nedelkoski, S.; Bogatinovski, J.; Mandapati, A.K.; Becker, S.; Cardoso, J.; Kao, O. Multi-source distributed system data for ai-powered analytics. In Proceedings of the Service-Oriented and Cloud Computing: 8th IFIP WG 2.14 European Conference, ESOCC 2020, Heraklion, Crete, Greece, September 28–30, 2020, Proceedings 8; Springer, 2020; pp. 161–176. [Google Scholar]
- Li, Z.; Chen, J.; Jiao, R.; Zhao, N.; Wang, Z.; Zhang, S.; Wu, Y.; Jiang, L.; Yan, L.; Wang, Z.; et al. Practical root cause localization for microservice systems via trace analysis. In Proceedings of the 2021 IEEE/ACM 29th International Symposium on Quality of Service (IWQOS); IEEE, 2021; pp. 1–10. [Google Scholar]
- Kovács, G.; Sebestyen, G.; Hangan, A. Evaluation metrics for anomaly detection algorithms in time-series. Acta Universitatis Sapientiae, Informatica 2019, 11, 113–130. [Google Scholar] [CrossRef]
- Sørbø, S.; Ruocco, M. Navigating the metric maze: a taxonomy of evaluation metrics for anomaly detection in time series. Data Mining and Knowledge Discovery 2024, 38, 1027–1068. [Google Scholar] [CrossRef]
- Sauka, K.; Shin, G.Y.; Kim, D.W.; Han, M.M. Adversarial robust and explainable network intrusion detection systems based on deep learning. Applied Sciences 2022, 12, 6451. [Google Scholar] [CrossRef]
- Liu, Y.; Ding, K.; Lu, Q.; Li, F.; Zhang, L.Y.; Pan, S. Towards Self-Interpretable Graph-Level Anomaly Detection. arXiv 2023, arXiv:2310.16520. [Google Scholar]
- Neupane, S.; Ables, J.; Anderson, W.; Mittal, S.; Rahimi, S.; Banicescu, I.; Seale, M. Explainable intrusion detection systems (x-ids): A survey of current methods, challenges, and opportunities. IEEE Access 2022, 10, 112392–112415. [Google Scholar] [CrossRef]
- Ying, Z.; Bourgeois, D.; You, J.; Zitnik, M.; Leskovec, J. Gnnexplainer: Generating explanations for graph neural networks. Advances in neural information processing systems 2019, 32. [Google Scholar]
- Holdijk, L.; Boon, M.; Henckens, S.; de Jong, L. [Re] parameterized explainer for graph neural network. In Proceedings of the ML Reproducibility Challenge 2020; 2021. [Google Scholar]
- Wang, H.; Liu, T.; Sheng, Z.; Li, H. Explanatory subgraph attacks against Graph Neural Networks. Neural Networks 2024, 172, 106097. [Google Scholar] [CrossRef] [PubMed]
- Herath, J.D.; Wakodikar, P.P.; Yang, P.; Yan, G. Cfgexplainer: Explaining graph neural network-based malware classification from control flow graphs. In Proceedings of the 2022 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN); IEEE, 2022; pp. 172–184. [Google Scholar]
- Mukherjee, K.; Wiedemeier, J.; Wang, T.; Kim, M.; Chen, F.; Kantarcioglu, M.; Jee, K. Interpreting gnn-based ids detections using provenance graph structural features. arXiv 2023, arXiv:2306.00934. [Google Scholar] [CrossRef]
- Baahmed, A.R.E.M.; Andresini, G.; Robardet, C.; Appice, A. Using graph neural networks for the detection and explanation of network intrusions. In Proceedings of the International Workshop on eXplainable Knowledge Discovery in Data Mining; 2023. [Google Scholar]
- Sharma, J.; Giri, C.; Granmo, O.C.; Goodwin, M. Multi-layer intrusion detection system with ExtraTrees feature selection, extreme learning machine ensemble, and softmax aggregation. EURASIP Journal on Information Security 2019, 2019, 1–16. [Google Scholar] [CrossRef]
- Wang, Y.; Liu, Z.; Zheng, W.; Wang, J.; Shi, H.; Gu, M. A Combined Multi-Classification Network Intrusion Detection System Based on Feature Selection and Neural Network Improvement. Applied Sciences 2023, 13, 8307. [Google Scholar] [CrossRef]
- Barnard, P.; Dasilva, L.A.; Marchetti, N. Don’t Just Explain, Enhance! Using Explainable Artificial Intelligence (XAI) to Automatically Improve Network Intrusion Detection. Authorea Preprints 2024. [Google Scholar]
- Bhatia, S.; Wadhwa, M.; Kawaguchi, K.; Shah, N.; Yu, P.S.; Hooi, B. Sketch-based anomaly detection in streaming graphs. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining; 2023; pp. 93–104. [Google Scholar]
- Liu, H.; Lang, B. Machine learning and deep learning methods for intrusion detection systems: A survey. Applied Sciences 2019, 9, 4396. [Google Scholar] [CrossRef]
- Andreas, B.; Dilruksha, J.; McCandless, E. Flow-based and packet-based intrusion detection using BLSTM. SMU Data Science Review 2020, 3, 8. [Google Scholar]
- Ring, M.; Wunderlich, S.; Scheuring, D.; Landes, D.; Hotho, A. A survey of network-based intrusion detection data sets. Computers & Security 2019, 86, 147–167. [Google Scholar] [CrossRef]
- Liu, L.; De Vel, O.; Han, Q.L.; Zhang, J.; Xiang, Y. Detecting and preventing cyber insider threats: A survey. IEEE Communications Surveys & Tutorials 2018, 20, 1397–1417. [Google Scholar] [CrossRef]
- Marchetti, M.; Stabili, D. READ: Reverse engineering of automotive data frames. IEEE Transactions on Information Forensics and Security 2018, 14, 1083–1097. [Google Scholar] [CrossRef]
- Zipperle, M.; Gottwalt, F.; Chang, E.; Dillon, T. Provenance-based intrusion detection systems: A survey. ACM Computing Surveys 2022, 55, 1–36. [Google Scholar] [CrossRef]



| Surveys | Year | Type of Data | Focus | Application Area | Graph-based Methods | Anomaly-based | Intrusion Detection Focus |
|---|---|---|---|---|---|---|---|
| [21] | 2020 | MD | Graph-based Anomaly Detection | Fraud Detection | ✓ | ✓ | ✗ |
| [22] | 2021 | H | Provenance Graph-based Detection | Threat Detection | ✓ | ◗ | ✗ |
| [23] | 2022 | N/A | GNN-based Anomaly Detection | Various Domains | ◗ | ✓ | ✗ |
| [24] | 2022 | N/A | Graph-based Deep Learning for Anomaly Detection | Distributed Systems | ✓ | ✓ | ✗ |
| [25] | 2023 | N, H | Graph-based Representation and Analytics | Botnet Detection | ✓ | ✓ | ✗ |
| [26] | 2023 | N, H | GNN-based Anomaly Detection | Anomaly Detection/ Intrusion Detection | ✓ | ✓ | ✓ |
| [27] | 2024 | N, H | GNN-based Anomaly Detection | Intrusion Detection | ✓ | ✓ | ✓ |
| Ours | 2025 | N, H, MM | Graph Representation Learning (GRL)-based Methods (Two-stage) & Graph-based End-to-end Methods | Anomaly Detection & Attack/Intrusion Detection | ✓ | ✓ | ✓ |

| Graph | Nodes | Edges | Related Works |
|---|---|---|---|
| Network traffic graph - packet data | IP addresses and ports/network devices/hosts/data-packets/ feature of data packet with attributes of feature value [HT] | Information transmission between nodes with attributes such as # of packets or bits / # of network requests, timestamps, protocol features, domain name features [D/UD, HP] | Static graphs: [8,55,58,65,66,67,68] Dynamic graphs [ST, S]: [57,62,68,69,70,71,72] |
| Network flow graph - flow data | IP addresses with flow node feature attributes/packets with attributes such as packet length, direction/each flow as node | Traffic flow between IP addresses/connection between nodes with attributes such as similarity score of flows and flow information such as protocol, flow duration, incoming bytes, bytes per packet, Transmission Control Protocol (TCP) flags [D/UD] | Static graphs: [60,73,74,75,76,77,78] Dynamic graphs [S]: [79] |
| Network logs graph | Network events with associated attributes/hosts with attributes such as IDs, label/logline with attributes such as src and dest entity behaviors, type of behavior, time occurred/network assets | Semantic edges/network events among hosts/associations between loglines/semantic events or operations between entities [D/UD, HT, I] | Static graphs: [19,51,80,81,82] Dynamic graphs [S]: [83,84,85] |
| Provenance graph | System entities, kernel objects with attributes such as type of node [HT/HM] | System events/syscalls between entities with attributes such as edge type, timestamp [D] | Static graphs: [7,46,59,80,86,87,88,89,90] Dynamic graphs: [49,52,91,92,93,94] |
| Audit data graph | File identifiers with degree sum as attributes/login entities/users and hosts/user activity items with timestamp attribute | File transitions/login actions between entities/relationship between activity logs with timestamp attribute [D/UD, HM, HT, B] | Static graphs: [95,96] Dynamic graphs: [97] |
| Host logs graph | Log event labels with attributes formed with semantic embedding of log event | Event flows with attributes such as weight to indicate # of times events flow [D] | Static graphs: [37] |
| Controller Area Network (CAN) graph | CAN ID/Arbitration ID with attributes of data content in CAN message | Connect nodes based on sequence with attributes: frequency of CAN ID pair, vectorized weight [D] | Dynamic graphs: [98,99,100] |
| Multi-modal data graph | Service instances with attributes: concatenation of metric and log features | Scheduling relationship between service instances | Static graphs: [101] |
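
As a rough illustration of the network flow graph row above, the following minimal sketch aggregates flow records into a directed graph whose nodes are IP addresses and whose edges carry flow statistics. The record layout `(src, dst, n_bytes)`, the function name `build_flow_graph`, and the sample addresses are assumptions made for this example only; the surveyed works use richer attributes such as protocol, duration, and TCP flags.

```python
from collections import defaultdict


def build_flow_graph(flows):
    """Aggregate (src, dst, n_bytes) flow records into a directed graph.

    Returns the set of nodes (IP addresses) and a dict that maps each
    directed edge (src, dst) to its flow count and total byte count.
    """
    nodes = set()
    edges = defaultdict(lambda: {"flows": 0, "bytes": 0})
    for src, dst, n_bytes in flows:
        nodes.update((src, dst))
        edge = edges[(src, dst)]
        edge["flows"] += 1          # number of flows observed on this edge
        edge["bytes"] += n_bytes    # total bytes carried on this edge
    return nodes, dict(edges)


# Hypothetical flow records used only for illustration.
flows = [
    ("10.0.0.1", "10.0.0.2", 500),
    ("10.0.0.1", "10.0.0.2", 300),
    ("10.0.0.2", "10.0.0.3", 120),
]
nodes, edges = build_flow_graph(flows)
```

Keeping edge attributes as aggregated counters is one simple design choice; a dynamic (streaming) variant would instead keep per-window or per-timestamp edge records.
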

| Technique | Algorithms | Description | Related Works |
|---|---|---|---|
| Data Reduction | Pre-clustering | Pre-cluster the graph by components using high-level statistics and choose cluster centers. | [73] |
| | Edge collation | Aggregate information from multiple edges in a graph to form a collated network. | [51,90] |
| | Sampling | Select a representative subset of nodes and/or edges from a large graph. | [20,89] |
| | Hoffman-based Data Adjustment | Reduce graph size by merging similar feature values and applying lossless compression. | [67] |
| Data Transformation | Generate histogram | Build an in-memory histogram at runtime from a streaming provenance graph. | [49] |
| | Transform to line graph | Transform the graph into a line graph representation by converting nodes into edges and vice versa. | [74,78] |
| | Adaptive graph augmentation | Generate two structurally perturbed views to create positive and negative pairs for contrastive learning. | [74,78] |
| Feature Extraction (FE) & Feature Optimization | Structural FE | Extract intrinsic properties of nodes and edges within the graph. | [58,100] |
| | | Extract behavioral features from the graph structure. | [95] |
| | Path mining | Use different strategies to extract meaningful paths from graphs, including DFS-based traversal, causal path selection, random walk exploration, and meta-path extraction using TF-IDF scoring. | [7,51,57,104] |
| | Sequence extraction | Extract node sequences and shingles using random walks. | [46,57,81] |
| | Graph pooling | Sort the nodes/edges of a subgraph by importance score and select only the top K nodes/edges. | [63] |
| | Page Ranking | Generate features by calculating the priority of each vertex based on its edges. | [106] |
| Encoding | One hot encoding | Encode graph attributes and structure based on categorical features. | [19,51] |
| | Word2vec | Encode sentences and phrases in graph attributes to a vector format. | [51,81,89] |
| | Hierarchical feature hashing | Encode the node's attributes multiple times at different levels. | [59] |
| | Vectorization | Encode causal and contextual data to row dimensional vectors; encode neighbourhood information based on a Poisson probability distribution. | [46] |
| | | Vectorize edge features by extracting TCP/IP layers' bytes. | [70] |
| | Graph Sketching | Represent graph states using a hash function to generate compact graph sketches. | [49,57,92] |
| | Spatial Temporal Node Encoding | Relative time encoding plus diffusion-based and distance-based spatial encoding to create node embeddings that capture the global and local structure of nodes. | [20] |
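
To make the "Transform to line graph" row concrete, here is a minimal sketch of the standard line graph construction for an undirected simple graph: each original edge becomes a node, and two such nodes are linked when the original edges share an endpoint. The function name `line_graph` and the toy edge list are assumptions for illustration; the cited works may use directed variants or library implementations.

```python
from itertools import combinations


def line_graph(edges):
    """Return the line graph of an undirected simple graph.

    Each original edge becomes a node; two such nodes are adjacent when
    the original edges share an endpoint. Assumes no duplicate edges.
    """
    # Normalise each undirected edge to a sorted tuple so (a, b) == (b, a).
    edge_nodes = [tuple(sorted(e)) for e in edges]
    adjacency = {e: set() for e in edge_nodes}
    for e1, e2 in combinations(edge_nodes, 2):
        if set(e1) & set(e2):       # the two edges share an endpoint
            adjacency[e1].add(e2)
            adjacency[e2].add(e1)
    return adjacency


# Toy path graph a - b - c - d, used only for illustration.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
lg = line_graph(edges)
```

After this transformation, edge-level tasks such as flow classification can be treated as node-level tasks on the new graph.
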

| Embedding Method | Algorithm | Embedded Features | Key Operations | Related Works |
|---|---|---|---|---|
| Lookup Embedding | N/A | N, E | Assigns each unique node or edge label a fixed d-dimensional vector through direct indexing in a predefined embedding matrix. | [90] |
| Distribution-based | N/A | N | Generates vector representations by minimizing the KL divergence between the conditional and empirical distributions of nodes. | [8] |
| Random Walk-based | BiNE | N | Maps two types of nodes into d-dimensional vectors. | [96] |
| | CTDNE | NT | Learns node embeddings using random walks while capturing timing information. | [96] |
| | doc2vec and TF-IDF | G | Forms sentences for graph paths using nodes and edges, then converts each path to a numeric vector using the PV-DM model of doc2vec. | [7,104] |
| | graph2vec | G | Learns a whole-graph representation from a set of rooted subgraphs. | [52,60] |
| | Interval inclined random walk | ET | Embeds spatial features (including edge features) and temporal features of a graph stream as vectors. | [55] |
| GNN-based | GCN | N, E, ET, G | Learns an embedding for each node by aggregating embeddings from all its neighbours. | [63,67,79,86] |
| | GraphSAGE | N, E | Iteratively samples and aggregates neighbouring node information up to k-hop depth to generate node/edge embeddings. | [62,65,75,76,77,89] |
| | GAT | N | Aggregates both local and root neighborhood information through weighted summation. | [93] |
| | Attention-based | N | Generates embeddings by weighting and aggregating graph features using attention mechanisms. | [80] |
| | Multi perspective | N, G | Generates node embeddings with a GNN and passes them through a mean pooling layer to obtain graph embeddings. | [103] |
| | Streaming Implementation | N | Combines graph sketching with an MPNN to embed streaming graph data. | [92] |
| Graph Autoencoder-based | GAT layers for encoder and decoder | N | Learns node embeddings by encoding and reconstructing graph features. | [90] |
| | GNN-based encoder and reconstruction decoder | N | Combines message-passing encoders with a decoder based on Neighborhood Wasserstein Reconstruction to capture and reconstruct both structural and feature-based neighborhood information. | [88] |
| Spatial-nonspatial Embedding | MLP, GCN, GAT | N | Embeds non-spatial data into a lower-dimensional space with an MLP and spatial information with GCN and GAT. | [74] |
| Advanced Embedding Methods | Graph Structure Learning | N | Learns the relationships between characteristic dimensions. | [12] |
| | Event-property composite model | N | An NRL model that learns event- and property-level representations in a property graph using MARINE and GNN models. | [109] |
| | Infograph | G | Learns graph embeddings by maximizing mutual information between normal paths and normal network activity patterns. | [51] |
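
The neighbour aggregation described for the GNN-based rows can be sketched in a few lines. The example below is a deliberately simplified, GraphSAGE-style mean aggregation step: it concatenates each node's own features with the mean of its neighbours' features, and omits the learned weight matrices, nonlinearity, and neighbour sampling that actual GraphSAGE uses. The function name `mean_aggregate` and the toy graph are assumptions for illustration.

```python
def mean_aggregate(features, neighbors):
    """One simplified neighbour-aggregation step.

    features:  dict mapping node -> list of floats (all the same length)
    neighbors: dict mapping node -> list of neighbouring nodes
    Returns a dict mapping node -> own features + mean neighbour features.
    """
    out = {}
    for node, feat in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            # Element-wise mean over the neighbours' feature vectors.
            agg = [
                sum(features[n][i] for n in nbrs) / len(nbrs)
                for i in range(len(feat))
            ]
        else:
            # Isolated node: use a zero vector as the neighbourhood summary.
            agg = [0.0] * len(feat)
        out[node] = list(feat) + agg
    return out


# Toy star graph centred on "a", used only for illustration.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
emb = mean_aggregate(features, neighbors)
```

Stacking k such steps lets each node's representation reflect its k-hop neighbourhood, which is the intuition behind the "k-hop depth" wording in the table.
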

| AD Model | Algorithm | Supervision | Real-time | Inputs | Type of Graph Anomaly | Related Works |
|---|---|---|---|---|---|---|
| ML Classification | Random Forest | US | ✗ | Node embeddings | Node | [8] |
| | Random Forest | S | ◗ | Subgraph embeddings | Subgraph | [60] |
| | Random Forest | S | ✓ | Structural and temporal embeddings | Node | [55] |
| | Isolation Forest | US | ✗ | Encoded features, Node embeddings | Node | [88] |
| | XGBoost | SS | ◗ | Encoded features, Node embeddings | Node | [89] |
| | OCSVM and Isolation Forest | US | ✗ | Behavioral features | Node | [95] |
| | CatBoost Classifier | S | ✗ | Edge embeddings | Edge | [77] |
| | Logistic Regression | S | ✓ | Structural and temporal embeddings | Path | [96] |
| Clustering | K-Means | SfS | ✗ | Node embeddings | Node | [93] |
| | K-medoid | US | ✓ | Graph sketches | Subgraph | [49] |
| | Customized distance-based | US | ✗ | Graph embeddings | Node | [81] |
| Outlier Detection | K-nearest neighbours (KNN) | US | ✗ | Node embeddings | Node | [90] |
| | Copula-based Outlier Detector (COPOD) | US | ✗ | Node embeddings | Path | [51] |
| | Robust Random Cut Forests (RRCF) | US | ✓ | Graph sketch vector | Node | [57] |
| | Local Outlier Factor (LOF) | US | ✗ | Path embeddings | Path | [7] |
| Deep Learning | Neural Networks: | | | | | |
| | - Recurrent CNN | US | ◗ | Subgraph embeddings | Subgraph | [52] |
| | - LSTM + BNN | S | ✗ | Vectorized graph data | Subgraph | [46] |
| | - MLP | S | ✓ | Node embeddings | Node | [62] |
| | - Dynamic GNN | SS | ✓ | Spatial & temporal embeddings | Node | [79] |
| | Generative Adversarial Network (GAN) | US | ✓ | Node embeddings | Subgraph | [92] |
| | Generative Adversarial Network (GAN) | US | ✓ | Edge encodings | Edge | [20] |
| | Transformer-based Networks | S | ✓ | Subgraph embeddings | Subgraph, Path | [104] |
| | Autoencoder | S | ✗ | Learned relations | Node | [12] |
| | Autoencoder | US | ✗ | Encoded graph data | Edge | [19] |
| | Autoencoder | US | ✗ | Structural features | Node | [58] |
| | Autoencoder | US | ✗ | Graph embeddings | Graph | [86] |
| | Autoencoder | SS | ✓ | Generated features | Node | [106] |
| Scoring-based Methods | TF-IGF Approximation | US | ◗ | Encoded graph features | Edge | [114] |
| | SEDANSCORER | US | ◗ | Sampled Subgraphs | Edge | [129] |
| Ensemble Methods | LightGBM Classification + Clustering | S | ✗ | Node embeddings | Node | [67] |
| Hybrid Methods | Autoencoder + Negative Sampling | US | ✗ | Node & graph embeddings | Node, Edge | [80] |
| | Deep SVDD | US | ✓ | Node & graph embeddings | Subgraph | [103] |
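
The two-stage pattern summarized in the table above (embeddings learned first, then passed to a separate detector) can be illustrated with a k-nearest-neighbour distance score over node embeddings. This is only a minimal sketch under stated assumptions, not the method of any cited work: the embedding values, the helper names `knn_scores` and `flag_outliers`, and the threshold are all hypothetical, and stage 1 (representation learning) is assumed to have been done elsewhere.

```python
import math


def knn_scores(embeddings, k=2):
    """Score each vector by its mean distance to its k nearest neighbours."""
    scores = []
    for i, u in enumerate(embeddings):
        # Distances from vector i to every other vector, smallest first.
        dists = sorted(
            math.dist(u, v) for j, v in enumerate(embeddings) if j != i
        )
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_outliers(scores, threshold):
    """Return the indices whose score exceeds the (assumed) threshold."""
    return [i for i, s in enumerate(scores) if s > threshold]


# Stage 1 output (hypothetical 2-D node embeddings; node 4 lies far away).
node_embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0),
]

# Stage 2: an unsupervised detector applied to the fixed embeddings.
scores = knn_scores(node_embeddings, k=2)
anomalous_nodes = flag_outliers(scores, threshold=1.0)
```

Because the detector only sees fixed vectors, it could be swapped for any other algorithm in the table without retraining the embedding stage, which is the flexibility usually attributed to two-stage designs.
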

| Anomaly Detection Model | Model Architecture | Description | Inputs | Supervision | Real-time | Type of Graph Anomaly | Related Works |
|---|---|---|---|---|---|---|---|
| GNN | GCN | Aggregate features from neighboring nodes to update representations | Vectorized Node & Edge Attributes | S | ✓ | Node | [68,83] |
| | GCN with One-Class Classifier | GCN learns graph representations, followed by one-class classifiers (e.g., SVM, SVDD) to detect anomalies | CAN Graph | S | ✓ | Node | [98] |
| | GCN with One-Class Classifier | | Multi-modal Data Graph | S | ✓ | Node | [108] |
| | GCN with One-Class Classifier | | Host Logs Graph | US | ✗ | Node | [37] |
| | GraphSAGE-based | Learn representations of nodes/edges by sampling a fixed-size neighborhood of each and aggregating information from those neighbors | Network Flow Graph | S | ✗ | Edge | [75] |
| | GraphSAGE-based | | Network Traffic Graph | S | ✗ | Edge | [65] |
| | GraphSAGE-based | | Provenance Graph | S | ✗ | Node | [94] |
| | GAT-based | Learn attention weights for each neighboring node during message passing | Vectorized Nodes | S | ◗ | Node | [91] |
| | GAT-based | | Line Graph | S | ✗ | Node | [74] |
| | GAT-based | | Network Traffic Graph | S | ◗ | Node | [71] |
| | GCN combined with GAT | Two GCN layers learn time features and distance features, followed by a GAT layer that captures the importance of activity logs | Audit Data Graph | S | ✗ | Node | [97] |
| | GCN with Graph Contrastive Learning | GCN-based encoder learns node embeddings using contrastive and classification losses, with federated training | Network Flow Graph | SS | ✗ | Node | [78] |
| | Knowledge-enabled GNN | Integrate structured domain knowledge into the GNN learning process | Network Logs Graph | S | ✗ | Edge | [82] |
| | GNN with Temporal Models (GRU, TGN, LSTM) | Enable dynamic representation learning and anomaly detection by processing time-varying features and updating node or edge embeddings across timestamps | Dynamic Graph | S | ✗ | Edge | [63] |
| | GNN with Temporal Models (GRU, TGN, LSTM) | | Dynamic Graph | S | ✓ | Edge | [84] |
| | GNN with Temporal Models (GRU, TGN, LSTM) | | Dynamic Graph | S | ✗ | Node | [59] |
| | GNN with Temporal Models (GRU, TGN, LSTM) | | Multi-modal Data Graph | US | ✓ | Node | [107] |
| Clustering | Vertex Cover Optimization | Pre-cluster edges to detect critical vertices and identify abnormal edges through the Z3 SMT solver and clustering loss analysis | Network Flow Graph | US | ✓ | Edge | [73] |
| | Micro-clustering-based | Run parallel instances of micro-cluster detection with different attributes, computing an anomaly score for each using the MIDAS algorithm | Dynamic Graph | US | ✓ | Subgraph | [85] |
| | Evolutionary Graph Clustering | Extract community evolution events and use them to detect anomalous evolutionary paths | Dynamic Graph | US | ✗ | Event | [64] |
| Graph Divergence Analysis | Graph Kernel-based | Calculate graph divergence using the WL kernel and detect anomalies using a dynamic threshold (Improved EWMA) | Network Traffic Graph | US | ✓ | Graph | [69] |
| | Graph Divergence Scoring | Compare subgraph structures, forecast expected behaviors to compute deviation scores, and flag anomalies when the scores exceed a threshold | CAN Graph | S | ✓ | Subgraph | [99] |
| | Graph Divergence Scoring | | Forecasted and observed node data | US | ✗ | Node | [12] |
| | Statistical Comparison | Use a chi-squared test to compare graph features between normal and test populations | CAN Graph | SS | ◗ | Subgraph | [100] |
| | Statistical Graph Analysis + Time Series Analysis | Use ERGM for statistical analysis of network topology graphs and an ARMA model for time series analysis of the coefficients | Network Traffic Graph | US | ✗ | Subgraph | [66] |
| Graph Autoencoders | Graph Transformer-based | Transformer encoder and DNN decoder reconstruct edge attributes, with anomalies detected via reconstruction and an ML model | Network Traffic Graph | US | ◗ | Edge | [70] |
| | Temporal Graph Transformer | Adversarial autoencoder with graph and temporal attention | Network Traffic Graph | US | ✗ | Graph | [72] |
| | Multi-Modal Temporal Graph Transformer | Transformer-based model with spatial, temporal, and cross-modal attention; detects anomalies from multi-modal graph reconstruction | Microservice Twin Graph | SS | ✓ | Graph | [101] |
| Others | Scoring/Weighting-based | Weight graph edges using anomaly scores from coarse logs, prune low-weight edges, and extract attack paths from the refined graph | Provenance Graph | S | ✗ | Subgraph | [87] |
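
Several rows above flag an anomaly when a per-snapshot score crosses a dynamic threshold. The sketch below shows one plain way to do this with an exponentially weighted moving average (EWMA). It is an assumption-laden illustration, not the "Improved EWMA" of [69]: the divergence values are made up, the function name `ewma_anomalies` and the parameters `alpha`, `width`, and `warmup` are hypothetical, and the graph divergence itself (e.g., from a WL kernel) is assumed to be computed elsewhere.

```python
def ewma_anomalies(scores, alpha=0.3, width=3.0, warmup=3):
    """Flag indices whose score exceeds EWMA mean + width * EWMA std."""
    mean = scores[0]
    var = 0.0
    flagged = []
    for t, x in enumerate(scores[1:], start=1):
        threshold = mean + width * var ** 0.5
        if t >= warmup and x > threshold:
            flagged.append(t)
            continue  # keep the anomaly from distorting the baseline
        # Standard incremental updates for the EWMA mean and variance.
        diff = x - mean
        mean += alpha * diff
        var = (1 - alpha) * (var + alpha * diff * diff)
    return flagged


# Hypothetical divergence between consecutive graph snapshots.
divergence = [0.10, 0.12, 0.09, 0.11, 0.10, 0.95, 0.11, 0.10]
alerts = ewma_anomalies(divergence)
```

With these illustrative values only the snapshot at index 5 is flagged. Skipping the update on flagged points is a design choice made here so that a single spike does not inflate the threshold for the snapshots that follow.
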

| Aspect | Two-stage Approach | End-to-end Approach |
|---|---|---|
| Flexibility | High - can swap components | Low - unified architecture |
| Supervision Requirement | Flexible - supports supervised, semi-supervised, and unsupervised learning | Often supervised - especially in GNN-based models |
| Real-time Capability | Moderate - operates mainly in batch mode, with limited real-time use | Better for real-time and streaming |
| Interpretability | Better - clear pipeline | Limited - black box |
| Computational Complexity and Scalability | Lower - simpler components and stage-wise optimization | Higher - complex models like GNN |
| Anomaly Granularity | Multi-level - detects node, edge, subgraph, and graph-level anomalies | Specialized - GNNs mostly focus on node/edge, non-GNNs on graph/subgraph |
| Best Use Case | Forensic or exploratory analysis where interpretability and flexibility are key | Real-time or adaptive intrusion detection emphasizing automation and speed |

| Type | Dataset | Detected Attacks/Anomalies | Description | Related Works |
|---|---|---|---|---|
| Network Datasets | CICIDS [143] | DoS, DDoS, Web Attack, Infiltration, Port scan, Botnet | Network traffic data provided in pcap and CSV formats, with extracted flow features | [19,60,70,74,76,77,78] |
| | UNSW-NB15 [144] | Exploits, Reconnaissance, DoS, Generic, Shellcode, Fuzzers, Worms, Backdoors, Analysis | Consists of real benign and simulated attack activity traffic data logs collected from simulation environments | [51,65,72,76,77,85,92] |
| | Network Traffic | DDoS, CMP, UDP/TCP SYN Flood, Botnet attacks; IoT: ARP spoofing, Ping of Death, Smurf, DoS | Multiple datasets including DDoS with varying attack rates [145], CTU-13 Botnet [146], Kyoto 2006+ datasets [147], TOR-nonTOR [148], SUEE8 [149], smart home IoT traffic [57] | [57,69,83,85,109] |
| | Malware Traffic | Malware attacks | AndMal2019 [150] (malicious apps in smart devices), EncMal2021 [55] (user-submitted malicious executables) | [55] |
| | AWID-CLS-R [151] | Flooding, Impersonation, and Injection | Open-source dataset with MAC-layer traces of benign and malicious activities in 802.11 WLAN traffic | [72] |
| | CERT [152] | Insider threats | Network logs for 100B+ behaviors of 4000 users | [81,84,97] |
| | IoT | Malware, Botnets, DoS, DDoS, Scanning, Password cracking, Ransomware, Injection, MITM, XSS, Backdoor | Traffic data, operational logs, and telemetry data collected from an IoT environment. IoT datasets include IoT-23, MedBIoT, Bot-IoT [153], ToN-IoT, NF-Bot-IoT, NF-ToN-IoT [154], IOST [155], ACI-IoT-2023 [156], and CICIoT2023 [157] | [12,58,62,67,68,70,71,74,75,78,79] |
| | Real-world Generated | APT, Trojan, Phishing, DNS Exfiltration, Brute force, Encrypted flooding/malicious traffic | Normal and network-level events were collected from an enterprise network, and an attack testbed generated multiple attack types | [63,66,73] |
| | CAN | DoS, Fuzzy, Suspension, Replay, Spoofing, Impersonation | Consists of CAN messages collected from an in-vehicle network. CAN datasets include CAN signal datasets [158] and OTIDS [159], a real-vehicle dataset from HCRL | [98,100,106] |
| Host Datasets | WUIL [160] | Data breaching | File system access dataset including 3050 normal and 222 attack samples | [95] |
| | DARPA | APT, Phishing, Dictionary attacks, FTP-write, Port scan | IDS data with full packet captures, featuring the 1999 set, OpTC dataset for APT activities, and Engagement datasets (E3 and E5) simulating real-world APT scenarios | [46,52,59,85,89,90,91,92,93,106] |
| | ATLAS [161] | 10 types of APT attacks | Labeled dataset of logs collected from a real-world system, consisting of 20,088 entities and 249k events per scenario | [52,93] |
| | StreamSpot [162] | Drive-by download attack | Consists of 500 benign graphs for 5 scenarios and 100 attack graphs for 1 attack scenario generated from data collected on Linux machines | [52,80,86,89,90,92,93,94,162] |
| | LANL [163] | APT attacks | Contains 1,648,275,307 events comprising benign events (authentication, network flow, DNS lookup) and attack events | [80,82,84,96] |
| | Unicorn [49] | APT attacks like malware downloads, remote code execution, and command injection | System provenance dataset collected using CamFlow, capturing system-level activities over three days with 125 benign graphs and 25 attack graphs. SC-1 and SC-2 versions available | [89,90,94] |
| | Production EDR [52] | APT attacks | Includes EDR logs from 18K endpoints and 59K extracted graphs | [52] |
| | Generated datasets | Attacks for enterprise systems such as reconnaissance, credential access, privilege escalation, and file and system manipulation | System logs containing simulated attack logs for 16 attack types | [87,103] |
| Multi-modal Datasets | DeepTraLog [164] | Service fault anomalies | Consists of 132,485 traces and 7.7M log messages, of which 17% are anomalous | [108] |
| | BETH [165] | Unusual/malicious activities | Kernel-level process logs and network traffic | [88] |
| | MSDS [166] | Anomalies | Consists of traces, logs, and metrics from an OpenStack-distributed system | [101] |
| | AIOps-Challenge | Service, Pod and Node anomalies | Instance metrics and logs for 40 services on 6 servers in a microservice-based system | [101] |
| | TraceRCA and TT-ARM [167] | Anomalies | Datasets with traces and metrics from microservice-based systems | [107] |

| Methods | | Description | Target Users | Related Works |
|---|---|---|---|---|
| Model Explanations | Explainable AI (XAI) | PGExplainer for local and global explanations. | Security analysts, Researchers | [77] |
| Investigate Anomalous Results | Ranking | Rank the RSGs based on co-occurrence probability and select key indicators. | Security analysts | [52] |
| | Filtering and Visualization | Filter a subset of important nodes using the LPR algorithm and present them in an anomalous log graph. | Security analysts | [37] |
| | | For the detected anomalous graph, apply graph reduction techniques and a community discovery algorithm to construct attack summary graphs and benign graphs. | System admins | [59] |
| | Attack Story Reconstruction | Use the activations (outputs) of layers to retrieve the training event trace most similar to the input that produced the malicious prediction. | Cyber defense team | [46] |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
