Submitted: 21 June 2023
Posted: 23 June 2023
Abstract
Keywords:
1. Introduction
2. Background
2.1. Digital Forensics (DF)
2.2. ISO/IEC 27043
2.3. Smart Environments
2.4. Machine Learning (ML)
3. State-of-the-Art Use of Machine Learning Techniques in Digital Forensics
- The IoTDots framework is proposed as a solution to deal with the large amounts of data collected by IoT devices and sensors.
- A methodology for the automatic prioritisation of suspicious file artefacts is proposed as a solution to deal with the growing volume of data and the manual retrieval of suspicious files.
- Intelligent methods to automate problem-solving are proposed as a solution to deal with the massive amounts of data that must be analysed for digital evidence.
- Automation that uses ML techniques for classification and AI techniques for prioritising suspicious devices is proposed as a solution to deal with the growing number of cases needing DF competence and the large volumes of data to be processed.
- Automatic text analysis to detect online sexual predatory chats is proposed as a solution to deal with the growth of cybercrime targeting minors, the large volume of data, and a DFI process that is done primarily by hand.
- The "VERITAS" mechanism to automatically collect and extract forensic evidence from smart environments is proposed as a solution to deal with the large amounts of data generated in smart environments.
- An intelligent intrusion detection system to detect regular and malicious attacks on data created in smart environments is proposed as a solution to deal with the simple or complex attacks that IoT networks in particular face.
- A blockchain-assisted shared audit framework for identifying data-scavenging attacks in virtualised resources is proposed as a solution to deal with attack and violation detection in smart environments.
- An intelligent forensic analysis mechanism is proposed as a solution to deal with the likelihood of continual attacks on IoT devices and the low processing power and memory of these devices.
4. The Impact of MLF on the DFI Process
5. Conclusion
References
- D. Popescul and L. Radu, "Data Security in Smart Cities: Challenges and Solutions", Informatica Economica, vol. 20, no. 1/2016, pp. 29-38, 2016. [CrossRef]
- D. Quick and K. Choo, "Big forensic data management in heterogeneous distributed systems: quick analysis of multimedia forensic data", Software: Practice and Experience, 2016. [CrossRef]
- S. Watson and A. Dehghantanha, "Digital forensics: the missing piece of the Internet of Things promise", Computer Fraud & Security, vol. 2016, no. 6, pp. 5-8, 2016. [CrossRef]
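- X. Du et al., "SoK: Exploring the State of the Art and the Future Potential of Artificial Intelligence in Digital Forensic Investigation", Proceedings of the 15th International Conference on Availability, Reliability and Security, 2020. [CrossRef]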
- V. Kebande, R. Ikuesan, N. Karie, S. Alawadi, K. Choo and A. Al-Dhaqm, "Quantifying the need for supervised machine learning in conducting live forensic analysis of emergent configurations (ECO) in IoT environments", Forensic Science International: Reports, vol. 2, p. 100122, 2020. [CrossRef]
- A. Valjarevic and H. Venter, "A Comprehensive and Harmonized Digital Forensic Investigation Process Model", Journal of Forensic Sciences, vol. 60, no. 6, pp. 1467-1483, 2015. [CrossRef]
- M. Conti, A. Dehghantanha, K. Franke and S. Watson, "Internet of Things security and forensics: Challenges and opportunities", Future Generation Computer Systems, vol. 78, pp. 544-546, 2018. [CrossRef]
- A. Valjarević, H. Venter and R. Petrović, "ISO/IEC 27043:2015 – Role and application", in 24th Telecommunications forum TELFOR, Serbia, Belgrade, 2016.
- M. Qadir and A. Varol, “The role of machine learning in Digital Forensics,” 2020 8th International Symposium on Digital Forensics and Security (ISDFS), 2020. [CrossRef]
- Goni, J. Mishion Gumpy, T. Umar Maigari, M. Muhammad and A. Saidu, "Cybersecurity and Cyber Forensics: Machine Learning Approach", Machine Learning Research, vol. 5, no. 4, p. 46, 2020. [CrossRef]
- S. Iqbal and S. Abed Alharbi, "Advancing Automation in Digital Forensic Investigations Using Machine Learning Forensics", Digital Forensic Science, 2020. [CrossRef]
- L. Babun, A. Sikder, A. Acar and A. Uluagac, "IoTDots: A Digital Forensics Framework for Smart Environments", arXiv.org, 2022. [Online]. Available: https://arxiv.org/abs/1809.00745.
- X. Du and M. Scanlon, "Methodology for the automated metadata-based classification of incriminating digital forensic artefacts", Proceedings of the 14th International Conference on Availability, Reliability and Security, pp. 1-8, 2019. Link: https://bit.ly/2Oqh6u6.
- A. Krivchenkov, B. Misnevs and D. Pavlyuk, "Intelligent Methods in Digital Forensics: State of the Art", Lecture Notes in Networks and Systems, pp. 274-284, 2019. [CrossRef]
- L. Babun, A. Sikder, A. Acar and S. Uluagac, "The Truth Shall Set Thee Free: Enabling Practical Forensic Capabilities in Smart Environments", Proceedings 2022 Network and Distributed System Security Symposium, 2022. [CrossRef]
- P. Shakeel, S. Baskar, H. Fouad, G. Manogaran, V. Saravanan and C. Montenegro-Marin, "Internet of things forensic data analysis using machine learning to identify roots of data scavenging", Future Generation Computer Systems, vol. 115, pp. 756-768, 2021. [CrossRef]
- Ngejane, J. Eloff, T. Sefara and V. Marivate, "Digital forensics supported by machine learning for the detection of online sexual predatory chats", Forensic Science International: Digital Investigation, vol. 36, p. 301109, 2021. [CrossRef]
- G. Kalnoor and S. Gowrishankar, "IoT-based smart environment using intelligent intrusion detection system", Soft Computing, vol. 25, no. 17, pp. 11573-11588, 2021. [CrossRef]
- M. Mazhar et al., "Forensic Analysis on Internet of Things (IoT) Device Using Machine-to-Machine (M2M) Framework", Electronics, vol. 11, no. 7, p. 1126, 2022. [CrossRef]
- Y. Adam and C. Varol, “Intelligence in digital forensics process,” 2020 8th International Symposium on Digital Forensics and Security (ISDFS), 2020. [CrossRef]
- Z. Baig, M. A. Khan, N. Mohammad, and G. B. Brahim, “Drone forensics and machine learning: Sustaining the investigation process,” Sustainability, vol. 14, no. 8, p. 4861, 2022. [CrossRef]
- A. Jarrett and K. R. Choo, “The impact of automation and artificial intelligence on Digital Forensics,” WIREs Forensic Science, vol. 3, no. 6, 2021. [CrossRef]
- N. Koroniotis, N. Moustafa, and J. Slay, “A new intelligent satellite deep learning network forensic framework for Smart Satellite Networks,” Computers and Electrical Engineering, vol. 99, p. 107745, 2022. [CrossRef]
- S. Ferreira, M. Antunes, and M. E. Correia, “A dataset of photos and videos for Digital Forensics Analysis Using Machine Learning Processing,” Data, vol. 6, no. 8, p. 87, 2021. [CrossRef]
- Z. Shahbazi and Y.-C. Byun, “NLP-based digital forensic analysis for online social network based on system security,” International Journal of Environmental Research and Public Health, vol. 19, no. 12, p. 7027, 2022. [CrossRef]
- Y. A. Balushi, H. Shaker, and B. Kumar, “The use of machine learning in digital forensics: Review paper,” Proceedings of the 1st International Conference on Innovation in Information Technology and Business (ICIITB 2022), pp. 96–113, 2023. [CrossRef]
- Y. C. Tok and S. Chattopadhyay, “Identifying threats, cybercrime and digital forensic opportunities in smart city infrastructure via threat modeling,” Forensic Science International: Digital Investigation, vol. 45, p. 301540, 2023. [CrossRef]
- Gumusbas, T. Yildirim, A. Genovese, and F. Scotti, “A comprehensive survey of databases and deep learning methods for cybersecurity and Intrusion Detection Systems,” IEEE Systems Journal, vol. 15, no. 2, pp. 1717–1731, 2021. [CrossRef]
- Hussein Ali Sahib, M. Y. Alsudani, M. K. Ali, Haydar Qassim Abbas, K. Moorthy, and Myasar Mundher Adnan, “Proposed intelligence systems based on digital Forensics: Review paper,” vol. 80, pp. 2647–2651, Jan. 2023. [CrossRef]
- Salih and N. Dabagh, “Digital Forensic Tools: A literature review,” Journal of Education and Science, vol. 32, no. 1, pp. 109–124, 2023. [CrossRef]
- Palmese, A. E. C. Redondi, and M. Cesana, “Feature-Sniffer: Enabling IoT Forensics in OpenWrt based Wi-Fi Access Points,” arXiv.org, Feb. 14, 2023. https://arxiv.org/abs/2302.06991 (accessed June 5, 2023).



| Ref. No. | Paper title | Problem statement of the paper | The solution proposed by the paper |
|---|---|---|---|
| [12] | IoTDots: A Digital Forensics Framework for Smart Environments | The amount of data collected by IoT devices and sensors is immense and contains valuable forensic evidence. This data can help identify and prevent unauthorised access within smart environments. | A new framework known as IoTDots is designed to help protect the data collected by various smart devices and applications. It features two main components: the IoTDots Analyzer and the IoTDots modifier. The former scans the source code of the applications and detects forensic information. The latter automatically inserts tracking logs and reports the results. |
| [13] | Methodology for the Automated Metadata-Based Classification of Incriminating Digital Forensic Artefacts | One of the most discussed challenges in DFI is the growing volume of data. Since the majority of file artefacts on seized devices are usually irrelevant to the investigation, manually retrieving suspicious files relevant to the investigation is very difficult. | To reduce the amount of manual analysis required in DFI, this research proposed a methodology for the automatic prioritising of suspicious file artefacts. Rather than providing the final analysis results, this methodology aims to predict and recommend the artefacts that are likely to be suspicious. A supervised machine learning approach is used, which makes use of previously processed case results. |
| [14] | Intelligent Methods in Digital Forensics: State of the Art | One of the main issues with DF is that a massive amount of data must be analysed for digital evidence. The primary goal of this work is to improve this difficult forensic process by employing intelligent methods for analysing digital evidence. | The ability of computers to learn a specific task from data is known as "intelligent methods", which also includes data mining, machine learning, soft computing, and traditional artificial intelligence. This term is commonly used to express ways to automate problem solving in DF. In support of DF, two main intelligent approaches are utilised, namely rule-based and anomaly-based. |
| [15] | The Truth Shall Set Thee Free: Enabling Practical Forensic Capabilities in Smart Environments | The interaction of devices, users, and apps in smart environments creates a large amount of data. Such data contains valuable forensic information about smart environment events and activities. However, current smart platforms lack any digital forensic capability for identifying, tracing, storing, and analysing data generated in these environments. | "VERITAS", a novel and practical DF capability for smart environments, is introduced in this paper. The Collector and Analyzer are the two main components of VERITAS. The Collector employs mechanisms to automatically collect forensically relevant data from the smart environment. The Analyzer then uses a First Order Markov Chain model to extract valuable and usable forensic evidence from the collected data for the purposes of a forensic investigation. |
| [16] | Internet of Things Forensic Data Analysis using Machine Learning to Identify Roots of Data Scavenging | Detecting attacks and violations in an IoT environment requires intensive data analysis and computational intelligence. On such platforms, advanced computer systems based on machine learning and smart computing are used to identify adversaries. | To discover and declare the presence of adversaries, DF requires intensive data analysis, such as retrieving and confirming system logs and evaluating blockchain information. This paper proposed a blockchain-assisted shared audit framework to analyse DF data in an IoT environment. It was created to identify the source and cause of data-scavenging attacks in virtualised resources, and it uses blockchain technology to manage access logs and controls. Access log data is examined with logistic regression and cross-validation to check the consistency of adversary event detection. |
| [5] | Quantifying the Need for Supervised Machine Learning in Conducting Live Forensic Analysis of Emergent Configurations (ECO) in IoT Environments | In an IoT system, particularly in the case of emergent configurations (ECOs), data might be dynamic, making it difficult to classify information during live forensics. In this sense, live forensics refers to a forensic investigation that is done in near real-time. | A conceptual framework based on supervised machine learning techniques was proposed in this paper. One of the advantages of using supervised ML techniques in live forensics is the ability of such techniques to predict possible events based on past occurrences. In addition, an automated feature identification was used to prevent redundancy throughout the feature selection and elimination. |
| [4] | SoK: Exploring the State of the Art and the Future Potential of Artificial Intelligence in Digital Forensic Investigation | The number of cases needing DF competence and the volume of data to be processed have overburdened digital forensic investigators. Many large data challenges are considered as having been solved by artificial intelligence. Automated evidence processing based on artificial intelligence techniques holds considerable potential for speeding up the digital forensic analysis process while improving case-processing capacity. | In DFI, automation uses ML techniques for classification. ML techniques can obtain important information for investigations more efficiently by exploiting existing digital evidence-processing knowledge. Also, digital evidence triage was developed for the prompt detection, processing and interpretation of digital evidence. Currently, with AI techniques, the investigator determines the priority of device gathering and processing at a crime scene. |
| [17] | Digital Forensics Supported by Machine Learning for the Detection of Online Sexual Predatory Chats | With the growth of cybercrime that targets minors, chat logs can be examined to detect harmful behaviour and report it to law enforcement authorities. This can make a significant difference in protecting minors on social media platforms from abuse by cyber predators. Since DFI is done primarily by hand, the enormous volume and variety of data make the task difficult for DF investigators. | To enable the automatic detection of harmful conversations in chat logs, the approach proposed in this research uses a DF process model supported by ML methods. ML has previously been used successfully in text analysis to detect online sexual predatory conversations. |
| [18] | IoT-based Smart Environment Using Intelligent Intrusion Detection System | One of the most fundamental characteristics of any smart device in an IoT network is its ability to acquire a large set of data and then send it to the destination (receiver) server over the internet. IoT-based networks are therefore particularly vulnerable to simple or sophisticated attacks, which must be detected early in the data transmission process to protect the network. | The primary purpose of this study was to design and build an intelligent intrusion detection system that uses machine learning models to detect attacks in the IoT network. The approach was designed to account for both regular and malicious attacks on data created in an IoT smart environment. |
| [19] | Forensic Analysis on Internet of Things (IoT) Device Using Machine-to-Machine (M2M) Framework | The adaptability of IoT devices raises the probability of continual attacks on them. Due to the low processing power and memory of IoT devices, security researchers have found it challenging to preserve records of the diverse attacks performed on these devices during a DFI. | An intelligent forensic analysis mechanism is proposed for the automatic detection of attacks on IoT devices based on the machine-to-machine framework. The proposed mechanism combines several ML techniques and different forensic analysis tools to detect different types of attacks. Furthermore, a third-party logging server addresses the problem of evidence gathering. To assess the effect and type of attacks and violations, forensic analysis is performed on the logs using a forensic server. |
| [30] | Digital Forensic Tools: A Literature Review | The growth of IoT devices that produce a large amount of forensic data presents the main challenge of IoT forensics. | A smart fridge was selected as the IoT device to examine and investigate. The dataset was examined using two ML algorithms, Bayes net and decision stump, each of which represents a distinct idea. The decision stump is a simple, one-level version of the decision tree ML technique. The Bayes net is useful for taking an event that occurred and estimating the likelihood that any one of several recognised causes contributed to it. The validation results indicate that the Bayes net algorithm is more accurate than the decision stump. |
| [31] | Feature-Sniffer: Enabling IoT Forensics in OpenWrt based Wi-Fi Access Points | IoT forensics and smart environments with their recognised challenges create a great opportunity to develop new forensic tools to make the task of forensic investigators easier, which can be used for acquiring, preserving, and also analysing such forensic data. | A user-friendly tool was proposed for smart devices that support WiFi and used in smart environment scenarios to allow forensic investigators, network administrators, and data scientists access to various features of network traffic with simple steps. The proposed tool allows network traffic features to be computed in real-time on any WiFi access point running the OpenWrt firmware, avoiding the time-consuming tasks of dumping network traffic and implementing the needed procedures to analyse the captured traffic. |
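
The row for [15] describes an Analyzer that applies a first-order Markov chain model to data collected from a smart environment. As a rough illustration of that kind of model only (not the VERITAS implementation), the sketch below estimates transition probabilities between consecutive events and scores new event sequences against them. The event names, the `floor` value for unseen transitions, and both function names are assumptions made for this example.

```python
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Estimate first-order transition probabilities P(next | current)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count every pair of consecutive events in the sequence.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for state, followers in counts.items()
    }


def sequence_likelihood(model, seq, floor=1e-6):
    """Multiply transition probabilities along seq.

    Transitions never seen during fitting receive the small `floor`
    probability (an assumption of this sketch) instead of zero.
    """
    likelihood = 1.0
    for current, nxt in zip(seq, seq[1:]):
        likelihood *= model.get(current, {}).get(nxt, floor)
    return likelihood


# Hypothetical event logs standing in for normal smart-home activity.
baseline = [
    ["door_unlocked", "motion_hall", "light_on"],
    ["door_unlocked", "motion_hall", "light_on"],
    ["door_unlocked", "motion_hall", "thermostat_set"],
    ["motion_hall", "light_on", "light_off"],
]
model = fit_transitions(baseline)

usual = ["door_unlocked", "motion_hall", "light_on"]
unusual = ["light_off", "door_unlocked", "thermostat_set"]
print(sequence_likelihood(model, usual))    # relatively high
print(sequence_likelihood(model, unusual))  # close to zero
```

A sequence whose likelihood falls far below that of typical activity could be flagged for an investigator to review. Where to set such a cut-off would depend on the environment and is not addressed here.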

| Reference No. | Used ML technique | Readiness processes | Initialisation processes | Acquisitive processes | Investigative processes | Concurrent processes |
|---|---|---|---|---|---|---|
| [12] | Markov Chain model | X | | | | |
| [17] | Logistic regression | X | | | | |
| [5] | Supervised machine learning | X | X | | | |
| [13] | Supervised machine learning | X | | | | |
| [14] | Unsupervised identification | X | X | | | |
| [15] | Markov Chain model | X | | | | |
| [16] | Logistic regression | X | X | | | |
| [18] | Markov Chain model | X | | | | |
| [19] | Decision tree algorithm | X | | | | |
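
The tables above name decision-tree style classifiers ([19]) and the decision stump ([30]), which is a decision tree with a single split. The following is a minimal sketch of a decision stump, assuming two hypothetical network-traffic features per time window and binary labels (0 for benign, 1 for attack). The data, feature choice, and function names are invented for illustration and do not come from the cited papers.

```python
def fit_stump(rows, labels):
    """Find the single (feature, threshold) split with the fewest training errors."""
    best = None
    for feature in range(len(rows[0])):
        # Try every observed value of this feature as a candidate threshold.
        for threshold in sorted({row[feature] for row in rows}):
            # Try both label assignments for the "above threshold" side.
            for above_label in (0, 1):
                predictions = [
                    above_label if row[feature] > threshold else 1 - above_label
                    for row in rows
                ]
                errors = sum(p != y for p, y in zip(predictions, labels))
                if best is None or errors < best[0]:
                    best = (errors, feature, threshold, above_label)
    _, feature, threshold, above_label = best
    return feature, threshold, above_label


def predict(stump, row):
    """Classify one row with the fitted one-level tree."""
    feature, threshold, above_label = stump
    return above_label if row[feature] > threshold else 1 - above_label


# Hypothetical windows: (packets per minute, distinct destination ports).
rows = [(12, 2), (15, 3), (9, 2), (240, 3), (310, 2), (280, 4)]
labels = [0, 0, 0, 1, 1, 1]  # 0 = benign, 1 = attack (assumed labels)

stump = fit_stump(rows, labels)
print(stump)                      # (feature index, threshold, label above threshold)
print(predict(stump, (200, 3)))   # a high packet rate
print(predict(stump, (10, 5)))    # a low packet rate
```

On this toy data the stump separates the classes on packet rate alone. Real forensic datasets would rarely be this cleanly separable, which is one reason deeper trees or probabilistic models such as Bayes nets are also evaluated.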
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
