Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

Data-Driven ICS Network Simulation for Synthetic Data Generation

Version 1: Received: 19 December 2023 / Approved: 19 December 2023 / Online: 19 December 2023 (06:01:34 CET)

How to cite: Kim, M.; Jeon, S.; Cho, J.; Gong, S. Data-Driven ICS Network Simulation for Synthetic Data Generation. Preprints 2023, 2023121391. https://doi.org/10.20944/preprints202312.1391.v1

Abstract

Industrial Control Systems (ICS) are integral to managing and optimizing processes in industries such as manufacturing and power generation. However, the scarcity of widely adopted ICS datasets hampers research in areas like optimization and security. This scarcity stems from the substantial cost and technical expertise required to build physical ICS environments. In response, this paper presents an approach to generating synthetic ICS data through a data-driven ICS network simulation. We avoid the need for expensive hardware by recreating the entire ICS environment in software. Moreover, rather than manually replicating the control logic of ICS components, we use existing data to generate the control logic automatically. The core of our method is the stochastic setting of setpoints, the target values that control the operation of the ICS process, which introduces randomness into the generated data. This approach lets us augment existing ICS datasets and meet the data requirements of machine learning-based ICS intrusion detection systems and other data-driven applications. Our simulated ICS environment uses virtualized containers to mimic the behavior of real-world PLCs and SCADA systems, while the control logic is derived from publicly available ICS datasets. Setpoints are generated probabilistically to ensure data diversity. Experimental results validate the fidelity of our synthetic data, showing that it closely replicates the temporal and statistical characteristics of real-world ICS networks. In conclusion, this data-driven ICS network simulation offers a cost-effective and scalable way to generate synthetic ICS data, giving researchers in ICS optimization and security diverse, realistic datasets. Future work may involve refining the simulation model and exploring additional applications for synthetic ICS data.
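
As a rough illustration of what probabilistic setpoint generation could look like, the sketch below draws new setpoints from the empirical distribution of setpoints seen in an existing dataset. This is an assumption made for illustration only: the abstract does not specify the sampling scheme, and the function names and the example values are hypothetical.

```python
import random
from collections import Counter


def fit_setpoint_distribution(observed):
    """Return (values, weights): the empirical distribution of observed setpoints.

    Assumption: setpoints take a small number of discrete values, so relative
    frequencies are a reasonable stand-in for the unknown true distribution.
    """
    counts = Counter(observed)
    values = sorted(counts)
    weights = [counts[v] / len(observed) for v in values]
    return values, weights


def sample_setpoints(values, weights, n, seed=None):
    """Draw n setpoints at random, weighted by their observed frequency."""
    rng = random.Random(seed)  # seeded generator so a run can be reproduced
    return rng.choices(values, weights=weights, k=n)


# Hypothetical setpoints extracted from an existing ICS dataset.
observed = [50.0, 50.0, 55.0, 60.0, 50.0, 55.0]
values, weights = fit_setpoint_distribution(observed)
schedule = sample_setpoints(values, weights, n=10, seed=0)
```

In a simulation along these lines, each sampled value in `schedule` would be handed to the simulated controller as its next target, so that repeated runs produce different but plausible process trajectories.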

Keywords

Industrial Control Systems (ICS); Synthetic Data Generation; Data-Driven Simulation; Machine Learning; Cybersecurity

Subject

Computer Science and Mathematics, Computer Science
