Submitted: 22 May 2023
Posted: 23 May 2023
Abstract
Keywords:
1. Introduction
2. Background
3. Evolution of Ransomware
- Distribution campaign: The attacker covertly induces the victim to download the dropper code that starts the infection, using methods such as email phishing and social engineering.
- Malicious code injection: During this phase, malicious code is downloaded and the target’s computer is infected with ransomware.
- Malicious payload staging: The ransomware establishes persistence by embedding itself in the system.
- Scanning: The ransomware searches the target computer and any network-accessible resources for content to encrypt.
- Encryption: The ransomware begins encrypting all of the selected documents.
- Payday: The victim can no longer access their data, and a notification demanding payment appears on the screen of the targeted device.
4. Ransomware Detection Techniques
4.1. Signature-Based Detection
4.2. Heuristic-Based Detection
4.3. Machine Learning-Based Detection
4.4. Network-Based Detection
4.5. Hybrid Detection
- Accuracy: Accuracy is the most straightforward evaluation metric, representing the percentage of correct predictions made by the model. It is calculated as the ratio of correct predictions to the total number of predictions. However, accuracy can be misleading on imbalanced datasets, where negative samples greatly outnumber positive samples [19,20].
- Precision: Of all samples predicted to be positive (flagged as ransomware by the algorithm), precision is the fraction that are true positives (ransomware samples correctly identified). It is computed as the ratio of true positives to the sum of true positives and false positives. A model with high precision produces few false positives, making it less likely to mislabel benign files as ransomware [20].
- Recall: Recall is the fraction of actual positive samples that the model correctly identifies. It is computed as the ratio of true positives to the sum of true positives and false negatives. A high recall score indicates that the model produces few false negatives, making it less likely to miss actual ransomware samples [20,21].
- ROC curve: The receiver operating characteristic (ROC) curve graphically represents the performance of a binary classifier as its discrimination threshold is varied. It plots the true positive rate (TPR) against the false positive rate (FPR) at various threshold values. The area under the ROC curve (AUC) summarizes the model’s overall performance, with higher AUC values indicating better performance [22].
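The metrics above can be made concrete with a short sketch. The code below is only an illustration in plain Python: the function names (`confusion_counts`, `accuracy`, `precision`, `recall`, `roc_auc`) and the label convention (1 = ransomware, 0 = benign) are assumptions introduced here, and the AUC is computed with the standard rank interpretation (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one) rather than by tracing the curve.

```python
from typing import List, Tuple


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for binary labels, where 1 = ransomware (assumed convention)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Correct predictions divided by all predictions."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp: int, fp: int) -> float:
    """TP / (TP + FP); defined as 0.0 when nothing was predicted positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """TP / (TP + FN); defined as 0.0 when there are no positive samples."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

On an imbalanced set of 5 ransomware and 95 benign samples, a model that labels everything benign reaches an accuracy of 0.95 while its recall is 0.0, which is the situation the accuracy caveat describes.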
5. Machine Learning for Ransomware Detection
5.1. Feature Extraction and Selection
5.1.1. Features Used for Ransomware Detection
- File access patterns: Ransomware typically accesses and encrypts files in a particular pattern, such as alphabetical order, extension type, or creation date. This behavior can be detected using file access patterns as features [31].
- System calls: Ransomware commonly uses system calls to perform its malicious activities, such as reading and writing files, creating processes, and network communication. System call traces can be extracted and used as features for detection [23].
- Network traffic: Ransomware frequently communicates with a command-and-control (C&C) server to send and receive commands. The analysis of network traffic can provide valuable features for detecting ransomware [27].
- Behavioral analysis: Behavioral analysis involves monitoring the behavior of running processes and identifying anomalies that indicate malicious activity. Features such as process creation, termination, and file access can be used [1].
- Static analysis: Static analysis examines an executable file’s code without running it to spot signs of malicious activity. Features such as code size, entropy, and string patterns can be used for this purpose [17].
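As one example of the static features listed above, byte-level Shannon entropy can be computed directly from a file's contents. The sketch below is illustrative only: the function names `shannon_entropy` and `static_features` are assumptions introduced here, not part of any surveyed system. Encrypted or packed content tends toward the maximum of 8 bits per byte, while plain text is typically much lower.

```python
import math
from collections import Counter
from typing import Dict


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (between 0.0 and 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def static_features(data: bytes) -> Dict[str, float]:
    """Hypothetical feature vector with two of the static features named in the text."""
    return {"size_bytes": float(len(data)), "entropy": shannon_entropy(data)}
```

A usage call such as `static_features(open(path, "rb").read())` would produce one row of a feature table; the choice of additional features (for example string patterns) is left open here.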
5.1.2. Feature Selection Techniques
- Principal component analysis: This technique reduces the dimensionality of a dataset by projecting it onto a small number of components that explain the majority of the variance in the data. Principal component analysis can help discard redundant or irrelevant information and retain the most informative representation for ransomware detection [28].
- Correlation analysis: Correlation analysis measures how strongly pairs of features in a dataset vary together. Highly correlated features may be redundant and can be removed to simplify the model and improve performance [11].
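The correlation-based selection described above can be sketched as a simple greedy filter. This is a minimal illustration, not a method taken from the cited studies: the function names, the feature names in the usage note, and the 0.9 threshold are all assumptions chosen for the example.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either column is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def drop_correlated(features: Dict[str, List[float]], threshold: float = 0.9) -> List[str]:
    """Keep a feature only if |r| with every already-kept feature is below the threshold."""
    kept: List[str] = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

With hypothetical columns `files_written = [1, 2, 3, 4, 5]`, `bytes_written = [10, 20, 30, 40, 50]`, and `net_connections = [3, 1, 4, 1, 5]`, the filter keeps `files_written` and `net_connections` and drops `bytes_written`, which is perfectly correlated with the first column. The result depends on the order in which features are visited, which is a known limitation of greedy filters.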
5.2. Machine Learning Detection Studies
6. Data Collection and Preprocessing
7. Challenges and Future Directions
- Rapidly evolving ransomware: Ransomware is a constantly changing threat, with new variants and attack techniques being developed regularly. This makes it challenging to build machine learning models that can detect all ransomware accurately and quickly [47].
- Adversarial attacks: Adversarial attacks involve modifying the input data to bypass the machine learning model’s detection capabilities. Such attacks can be used to evade ransomware detection systems, making them less effective [47].
- Real-time detection requirements: Ransomware can spread rapidly and cause significant damage within a short time. Therefore, ransomware detection systems must be able to detect ransomware in real-time to prevent further spread and damage [48].
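To illustrate what the real-time requirement can mean in practice, the sketch below flags bursts of events (for example file modifications) inside a sliding time window. It is a hypothetical example, not a technique reported in the cited work: the class name `BurstDetector`, the window length, and the event threshold are assumptions, and a deployed system would feed such a monitor from operating-system file or process events and combine it with a trained model.

```python
from collections import deque
from typing import Deque


class BurstDetector:
    """Flags when more than `max_events` events fall within the last `window_seconds`."""

    def __init__(self, window_seconds: float = 10.0, max_events: int = 50) -> None:
        self.window_seconds = window_seconds
        self.max_events = max_events
        self._times: Deque[float] = deque()

    def observe(self, timestamp: float) -> bool:
        """Record one event; timestamps are assumed to arrive in non-decreasing order."""
        self._times.append(timestamp)
        # Discard events that have fallen out of the sliding window.
        while self._times and timestamp - self._times[0] > self.window_seconds:
            self._times.popleft()
        return len(self._times) > self.max_events
```

Each call to `observe` does a small, bounded amount of work per event, which is the property that matters for online use; the thresholds would need tuning against benign workloads such as backups or bulk file operations.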
7.1. Future Work in This Field
- Incorporating real-time detection capabilities: Ransomware detection systems must incorporate real-time detection capabilities to quickly identify and prevent ransomware attacks. This can be achieved through the use of real-time monitoring and analysis techniques [46].
- Collaboration and sharing of data: Collaboration and sharing of data among researchers and organizations can help develop more effective ransomware detection systems. This can help build more comprehensive datasets for training and testing machine learning models [47].
- Developing effective machine learning-based ransomware detection systems is challenging for several reasons. However, with advanced techniques and collaboration among researchers and organizations, it is possible to develop more robust and accurate ransomware detection systems [45].
8. Conclusion
References
- Celdrán, A.H.; Sánchez, P.M.S.; Castillo, M.A.; Bovet, G.; Pérez, G.M.; Stiller, B. Intelligent and behavioral-based detection of malware in IoT spectrum sensors. International Journal of Information Security 2022, 1–21.
- Chesti, I.A.; Humayun, M.; Sama, N.U.; Jhanjhi, N. Evolution, mitigation, and prevention of ransomware. In Proceedings of the 2020 2nd International Conference on Computer and Information Sciences (ICCIS). IEEE; 2020; pp. 1–6.
- Philip, K.; Sakir, S.; Domhnall, C. Evolution of ransomware. IET Netw 2018, 7, 321–327.
- Jegede, A.; Fadele, A.; Onoja, M.; Aimufua, G.; Mazadu, I.J. Trends and Future Directions in Automated Ransomware Detection. Journal of Computing and Social Informatics 2022, 1, 17–41.
- Brewer, R. Ransomware attacks: detection, prevention and cure. Network Security 2016, 2016, 5–9.
- Bello, I.; Chiroma, H.; Abdullahi, U.A.; Gital, A.Y.; Jauro, F.; Khan, A.; Okesola, J.O.; Abdulhamid, S.M. Detecting ransomware attacks using intelligent algorithms: Recent development and next direction from deep learning and big data perspectives. Journal of Ambient Intelligence and Humanized Computing 2021, 12, 8699–8717.
- Scaife, N.; Carter, H.; Traynor, P.; Butler, K.R. Cryptolock (and drop it): stopping ransomware attacks on user data. In Proceedings of the 2016 IEEE 36th international conference on distributed computing systems (ICDCS). IEEE; 2016; pp. 303–312.
- Sgandurra, D.; Muñoz-González, L.; Mohsen, R.; Lupu, E.C. Automated dynamic analysis of ransomware: Benefits, limitations and use for detection. arXiv 2016, arXiv:1609.03020.
- Prakash, K.P.; Nafis, T.; Biswas, S.S. Preventive Measures and Incident Response for Locky Ransomware. International Journal of Advanced Research in Computer Science 2017, 8.
- Paquet-Clouston, M.; Haslhofer, B.; Dupont, B. Ransomware payments in the bitcoin ecosystem. Journal of Cybersecurity 2019, 5, tyz003.
- Kok, S.; Abdullah, A.; Jhanjhi, N.; Supramaniam, M. Ransomware, threat and detection techniques: A review. Int. J. Comput. Sci. Netw. Secur 2019, 19, 136.
- Thakran, E.; Kumari, A. Impact of “Ransomware” on critical infrastructure due to pandemic. Available at SSRN 4361110 2023.
- Ahmed, Y.A.; Huda, S.; Al-rimy, B.A.S.; Alharbi, N.; Saeed, F.; Ghaleb, F.A.; Ali, I.M. A weighted minimum redundancy maximum relevance technique for ransomware early detection in industrial IoT. Sustainability 2022, 14, 1231.
- Akhtar, M.S.; Feng, T. Malware Analysis and Detection Using Machine Learning Algorithms. Symmetry 2022, 14, 2304.
- Sharmeen, S.; Ahmed, Y.A.; Huda, S.; Koçer, B.Ş.; Hassan, M.M. Avoiding future digital extortion through robust protection against ransomware threats using deep learning based adaptive approaches. IEEE Access 2020, 8, 24522–24534.
- Swami, S.; Swami, M.; Nidhi, N. Ransomware Detection System and Analysis Using Latest Tool. International Journal of Advanced Research in Science, Communication and Technology 2021, 7, 2581–9429.
- Yamany, B.; Elsayed, M.S.; Jurcut, A.D.; Abdelbaki, N.; Azer, M.A. A New Scheme for Ransomware Classification and Clustering Using Static Features. Electronics 2022, 11, 3307.
- Yamany, B.; Azer, M.A.; Abdelbaki, N. Ransomware Clustering and Classification using Similarity Matrix. In Proceedings of the 2022 2nd International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC). IEEE; 2022; pp. 41–46.
- Kok, S.; Azween, A.; Jhanjhi, N. Evaluation metric for crypto-ransomware detection using machine learning. Journal of Information Security and Applications 2020, 55, 102646.
- Masum, M.; Faruk, M.J.H.; Shahriar, H.; Qian, K.; Lo, D.; Adnan, M.I. Ransomware classification and detection with machine learning algorithms. In Proceedings of the 2022 IEEE 12th Annual Computing and Communication Workshop and Conference (CCWC). IEEE; 2022; pp. 0316–0322.
- Azmoodeh, A.; Dehghantanha, A.; Conti, M.; Choo, K.K.R. Detecting crypto-ransomware in IoT networks based on energy consumption footprint. Journal of Ambient Intelligence and Humanized Computing 2018, 9, 1141–1152.
- Edis, D.; Hayman, T.; Vatsa, A. Understanding Complex Malware. In Proceedings of the 2021 IEEE Integrated STEM Education Conference (ISEC). IEEE; 2021; pp. 1–2.
- Ullah, F.; Javaid, Q.; Salam, A.; Ahmad, M.; Sarwar, N.; Shah, D.; Abrar, M. Modified decision tree technique for ransomware detection at runtime through API Calls. Scientific Programming 2020, 2020.
- Khammas, B.M. Ransomware detection using random forest technique. ICT Express 2020, 6, 325–331.
- Ghouti, L.; Imam, M. Malware classification using compact image features and multiclass support vector machines. IET Information Security 2020, 14, 419–429.
- Arunkumar, M.; Kumar, K.A.
- Madani, H.; Ouerdi, N.; Boumesaoud, A.; Azizi, A. Classification of ransomware using different types of neural networks. Scientific Reports 2022, 12, 1–11.
- Arivudainambi, D.; KA, V.K.; Visu, P.; et al. Malware traffic classification using principal component analysis and artificial neural network for extreme surveillance. Computer Communications 2019, 147, 50–57.
- Hwang, J.; Kim, J.; Lee, S.; Kim, K. Two-stage ransomware detection using dynamic analysis and machine learning techniques. Wireless Personal Communications 2020, 112, 2597–2609.
- Dargahi, T.; Dehghantanha, A.; Bahrami, P.N.; Conti, M.; Bianchi, G.; Benedetto, L. A Cyber-Kill-Chain based taxonomy of crypto-ransomware features. Journal of Computer Virology and Hacking Techniques 2019, 15, 277–305.
- Sheen, S.; Asmitha, K.; Venkatesan, S. R-Sentry: Deception based ransomware detection using file access patterns. Computers and Electrical Engineering 2022, 103, 108346.
- Zahra, A.; Shah, M.A. IoT based ransomware growth rate evaluation and detection using command and control blacklisting. In Proceedings of the 2017 23rd international conference on automation and computing (ICAC). IEEE; 2017; pp. 1–6.
- Shaukat, S.K.; Ribeiro, V.J. RansomWall: A layered defense system against cryptographic ransomware attacks using machine learning. In Proceedings of the 2018 10th international conference on communication systems & networks (COMSNETS). IEEE; 2018; pp. 356–363.
- Makinde, O.; Sangodoyin, A.; Mohammed, B.; Neagu, D.; Adamu, U. Distributed network behaviour prediction using machine learning and agent-based micro simulation. In Proceedings of the 2019 7th International Conference on Future Internet of Things and Cloud (FiCloud). IEEE; 2019; pp. 182–188.
- Almashhadani, A.O.; Kaiiali, M.; Sezer, S.; O’Kane, P. A multi-classifier network-based crypto ransomware detection system: A case study of locky ransomware. IEEE Access 2019, 7, 47053–47067.
- Singh, A.; Ikuesan, R.A.; Venter, H. Ransomware detection using process memory. arXiv 2022, arXiv:2203.16871.
- Silva, J.A.H.; Hernández-Alvarez, M. Large scale ransomware detection by cognitive security. In Proceedings of the 2017 IEEE Second Ecuador Technical Chapters Meeting (ETCM). IEEE; 2017; pp. 1–4.
- Modi, J. Detecting ransomware in encrypted network traffic using machine learning. PhD thesis, 2019.
- Ameer, M. Android ransomware detection using machine learning techniques to mitigate adversarial evasion attacks. Capital University of Science and Technology, Islamabad, Pakistan 2019.
- Talabani, H.S.; Abdulhadi, H.M.T. Bitcoin ransomware detection employing rule-based algorithms. Science Journal of University of Zakho 2022, 10, 5–10.
- Adamu, U.; Awan, I. Ransomware prediction using supervised learning algorithms. In Proceedings of the 2019 7th International Conference on Future Internet of Things and Cloud (FiCloud). IEEE; 2019; pp. 57–63.
- Modi, J. Detecting ransomware in encrypted network traffic using machine learning. PhD thesis, 2019.
- Wan, Y.L.; Chang, J.C.; Chen, R.J.; Wang, S.J. Feature-selection-based ransomware detection with machine learning of data analysis. In Proceedings of the 2018 3rd international conference on computer and communication systems (ICCCS). IEEE; 2018; pp. 85–88.
- Alzahrani, A.; Alshehri, A.; Alshahrani, H.; Alharthi, R.; Fu, H.; Liu, A.; Zhu, Y. Randroid: Structural similarity approach for detecting ransomware applications in android platform. In Proceedings of the 2018 IEEE International Conference on Electro/Information Technology (EIT). IEEE; 2018; pp. 0892–0897.
- Beaman, C.; Barkworth, A.; Akande, T.D.; Hakak, S.; Khan, M.K. Ransomware: Recent advances, analysis, challenges and future research directions. Computers & Security 2021, 111, 102490.
- McIntosh, T.; Kayes, A.; Chen, Y.P.P.; Ng, A.; Watters, P. Ransomware mitigation in the modern era: A comprehensive review, research challenges, and future directions. ACM Computing Surveys (CSUR) 2021, 54, 1–36.
- Aboaoja, F.A.; Zainal, A.; Ghaleb, F.A.; Al-rimy, B.A.S.; Eisa, T.A.E.; Elnour, A.A.H. Malware detection issues, challenges, and future directions: A survey. Applied Sciences 2022, 12, 8482.
- Gorment, N.Z.; Selamat, A.; Cheng, L.K.; Krejcar, O. Machine Learning Algorithm for Malware Detection: Taxonomy, Current Challenges and Future Directions. IEEE Access 2023.
- Kapoor, A.; Gupta, A.; Gupta, R.; Tanwar, S.; Sharma, G.; Davidson, I.E. Ransomware detection, avoidance, and mitigation scheme: a review and future directions. Sustainability 2021, 14, 8.

| References | Year | Name of the ransomware | Description |
|---|---|---|---|
| [4] | 1989 | AIDS Trojan | The first known ransomware, distributed on floppy disks; it demanded a payment of $189 to unlock infected files. |
| [5] | 2012 | Reveton | Ransomware that posed as law enforcement and demanded payment for supposed illegal activities. |
| [7] | 2013 | CryptoLocker | One of the first widespread ransomware attacks that used encryption to lock victims’ files. |
| [8] | 2014 | CryptoWall | A variant of CryptoLocker that caused millions of dollars in damages. |
| [3] | 2015 | TeslaCrypt | A ransomware strain that targeted gamers and encrypted game-related files. |
| [9] | 2016 | Locky | Ransomware that was spread through malicious email attachments. |
| [3] | 2017 | WannaCry | A ransomware attack affecting over 200,000 systems across 150 different countries. |
| [10] | 2018 | SamSam | A ransomware attack that targeted hospitals, municipalities, and other organizations. |
| [3] | 2019 | Ryuk | A ransomware attack that caused significant damage to several companies and organizations. |
| [11] | 2020 | Maze | A ransomware attack that encrypted victims’ files and threatened to leak sensitive data if the ransom was not paid. |
| [3] | 2021 | REvil/Sodinokibi | A ransomware attack that targeted Kaseya, a software company, and affected over 1,500 businesses worldwide. |
| [12] | 2022 | Royal Ransomware | Ransomware that encrypts victims’ files and demands a ransom payment to decrypt them. It targets businesses, governments, and healthcare organizations, and its victims are mostly in the United States. |
| [12] | 2023 | LockBit Ransomware | Ransomware that encrypts files and demands payment in exchange for the decryption key, often in conjunction with phishing emails or other social engineering techniques. |
| References | Algorithms | Characteristics |
|---|---|---|
| [23,24] | Decision trees | Decision trees can be trained on features such as file modifications, network traffic, and system calls to distinguish between ransomware and benign software behavior. The decision tree that results can then be used to determine whether new data contains ransomware. |
| [23,24] | Random forests | Random forests are an ensemble method that combines tree predictors, where each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. Performance may be improved compared with standalone decision trees. The random forest uses this collection of decision trees to classify the input data. |
| [25,26] | Support vector machines | Support vector machines can be trained on features such as system calls, network traffic, and file behavior to distinguish between ransomware and benign software behavior. The resulting support vector machine can then be used to determine whether new data constitutes ransomware. Support vector machines are useful when the data is high-dimensional and not linearly separable, as is often the case in ransomware detection. |
| [27,28] | Neural networks | Like a biological brain, neural networks can find patterns in vast amounts of data. After receiving the raw input, multi-layer neural networks perform internal operations to extract and select features, so they have a built-in mechanism for feature extraction and selection. A basic neural network comprises an input layer, a hidden layer, and an output layer holding the class variables. The layers form an interconnected network of neurons. |
| References | Year | Author | Resolved the Issue | Utilized Technique | Result | Limitation |
|---|---|---|---|---|---|---|
| [32] | 2017 | Zahra & Shah | Detecting CryptoWall ransomware attacks | Blocklisting of command-and-control (C&C) servers | The web proxy server, which acts as the TCP/IP traffic gateway, extracts the TCP/IP header. | The model’s efficacy and precision in identifying ransomware and its attack techniques across various operating system environments were not demonstrated through implementation. |
| [33] | 2018 | Shaukat & Ribeiro | Detection of ransomware | RansomWall, a layered and hybrid mechanism | Effective at identifying zero-day attacks | N/A |
| [34] | 2019 | Makinde et al. | Determining whether a real network system is vulnerable to a ransomware attack | Machine learning | Correlation greater than 0.8 | Simulated the behavior of only a small group of users. |
| [35] | 2019 | Almashhadani et al. | Detecting Locky ransomware | Behavioral approach to ransomware detection using parallel classifiers | Highly reliable detection with a low proportion of false positives | N/A |
| [36] | 2022 | Singh et al. | Discovery of new ransomware families and classification of newly discovered ransomware attacks | Examines process memory access privileges to enable rapid and accurate malware detection | Between 81.38% and 96.28% accuracy | N/A |
| References | Year | Author | Problem Addressed | Method Used | Result |
|---|---|---|---|---|---|
| [25] | 2017 | Rahman and Hasan | Enhanced ransomware detection method | Support vector machines | An integrated approach detects ransomware better than static or dynamic analysis used separately. |
| [21] | 2018 | Dehghantanha et al. | Fast and accurate detection of Windows ransomware | NetConverse (J48 decision tree classifier) | 97.1% true positive detection rate |
| [38] | 2019 | Modi | Separating ransomware traffic from regular traffic | Logistic regression, random forest, and support vector machine | The best detection rate is 99.9% for the random forest, with 0% false positives. |
| [39] | 2019 | Ameer | Detection of ransomware | Static and dynamic analysis | 100% detection and classification precision |
| [24] | 2020 | Khammas | Detection of ransomware | Random forest | 97.74% of samples detected |
| [29] | 2020 | Hwang et al. | Improved ransomware detection method | Random forest and Markov models | 97.3% overall accuracy, 4.8% false positives, and 1.5% false negatives |
| [40] | 2022 | Talabani and Abdulhadi | Poor accuracy of ransomware detection tools based on data mining and machine learning | Decision Table and PART (partial decision tree) | Recall (96%), accuracy (96.01%), F-measure (95.6%), and precision (95.9%) |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
