Preprint
Review

This version is not peer-reviewed.

A Systematic Literature Review on Machine Learning for Intrusion Detection Systems

Submitted: 15 May 2026

Posted: 19 May 2026

Abstract
The use of Artificial Intelligence (AI) and Machine Learning (ML) in cybersecurity, especially for building Intrusion Detection Systems (IDS), has become increasingly important. These systems are essential for detecting malicious behaviour, identifying network issues, and stopping cyberattacks in real time. Although extensive research has been conducted on various ML and Deep Learning (DL) models for IDS, the current literature remains incomplete and spans many different datasets, methods, and evaluation standards. As cyber threats become more advanced, a thorough analysis of ML techniques for intrusion detection is crucial. The goal of this Systematic Literature Review (SLR) is to give a comprehensive picture of the most recent academic articles on ML-based IDS. The study addresses research questions about the most widely used algorithms, the types of attacks and network environments covered, the methodological problems that remain unsolved, and the new trends that should shape future research. Following the PRISMA framework, we conducted a systematic review of peer-reviewed articles published between January 2022 and May 2025. Searches of IEEE Xplore, ACM Digital Library, and SpringerLink yielded 22,558 initial records. After we applied strict inclusion criteria, 125 papers remained for the final analysis. We created a standardised data extraction form in MS Excel to gather bibliographic details, research emphasis, methodological strategies, datasets, evaluation criteria, and recognised constraints. We employed thematic analysis to develop a clear taxonomy.
We identified five main research themes in our analysis: (1) ensemble and hybrid learning pipelines focused on performance optimisation (30 papers), (2) context-specific IDS designs for Internet of Things (IoT), cloud, and Software-Defined Networking (SDN) environments (34 papers), (3) data-centric engineering that deals with class imbalance and feature selection (20 papers), (4) deep neural architectures for representation learning (31 papers), and (5) trustworthiness concerns such as adversarial robustness, zero-day detection, and Explainable AI (XAI) (10 papers). Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Random Forests are the most commonly used algorithms, often in combination. Nonetheless, significant deficiencies remain: only about 2% of papers incorporate XAI, only 4% focus on adversarial robustness, and none validate their models in real-world production settings. Denial of Service (DoS) and Distributed DoS (DDoS) are the most common types of attacks in the literature, while Web attacks, ransomware, and advanced persistent threats remain poorly studied. The number of publications grows by an average of 30.2% per year, but the field still relies on legacy benchmark datasets rather than operational validation.
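As a quick sanity check on the figures above, the following minimal Python sketch tallies the five theme counts and the screening yield. The numbers are taken directly from the abstract; the dictionary keys and variable names are shorthand labels introduced here for illustration, not terms used by the authors.

```python
# Theme counts as reported in the abstract; keys are illustrative shorthand labels.
theme_counts = {
    "ensemble_hybrid": 30,
    "context_specific": 34,
    "data_centric": 20,
    "deep_architectures": 31,
    "trustworthiness": 10,
}
initial_records = 22_558   # records returned by the database searches
included_papers = 125      # papers retained for the final analysis

# Total papers across themes and each theme's share of that total.
total = sum(theme_counts.values())
shares = {name: 100 * n / total for name, n in theme_counts.items()}

# Percentage of initial records that survived screening.
selection_rate = 100 * included_papers / initial_records

for name, pct in shares.items():
    print(f"{name}: {theme_counts[name]} papers ({pct:.1f}%)")
print(f"total across themes: {total}")
print(f"retained after screening: {selection_rate:.2f}% of initial records")
```

The theme counts sum to the 125 included papers, which is consistent with each paper being assigned to exactly one theme, and the trustworthiness theme accounts for 8% of the corpus.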
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2026 MDPI (Basel, Switzerland) unless otherwise stated