Preprint | Article | Version 1 | Preserved in Portico | This version is not peer-reviewed

Graph Stream Compression Scheme Based on Pattern Dictionary Using Provenance

Version 1: Received: 19 April 2024 / Approved: 19 April 2024 / Online: 19 April 2024 (15:53:55 CEST)

How to cite: Lee, H.; Shin, B.; Choi, D.; Lim, J.; Bok, K.; Yoo, J. Graph Stream Compression Scheme Based on Pattern Dictionary Using Provenance. Preprints 2024, 2024041359. https://doi.org/10.20944/preprints202404.1359.v1

Abstract

With recent advances in network technology and the spread of the Internet, the use of social network services and Internet of Things devices has flourished. This has led to the continuous generation of large volumes of graph stream data, in which changes such as additions or deletions of vertices and edges occur over time. Because storage space must be used efficiently and security requirements must be met, graph stream compression has become essential in various applications. Although various graph compression methods have been studied, most of them do not fully reflect the dynamic characteristics of graph streams or the complexity of large graphs. In this paper, we propose a compression scheme that uses provenance data to process and analyze large graph stream data efficiently. The scheme obtains provenance data by analyzing the graph stream, builds a pattern dictionary from that data, and performs dictionary-based compression. It improves on existing dictionary-based graph compression methods by using provenance to track pattern changes and evaluate pattern importance, which enables more efficient dictionary management. Furthermore, it considers the relationships among sub-patterns using an FP-tree and manages the pattern dictionary by updating pattern scores over time. Our experiments show that the proposed scheme outperforms existing graph compression methods in key performance metrics such as compression rate and processing time.
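As a rough illustration of dictionary-based compression with time-dependent pattern scores, the following minimal Python sketch may help. It is not the scheme proposed in the paper: the `PatternDictionary` class, the `compress` function, the exponential decay with a `half_life` parameter, and the fixed `capacity` are all assumptions made for illustration, and neither provenance tracking nor the FP-tree is modeled. Only the encoding side is shown.

```python
import math


class PatternDictionary:
    """Hypothetical pattern dictionary whose scores decay exponentially over time."""

    def __init__(self, capacity, half_life):
        self.capacity = capacity                  # max number of stored patterns (assumed)
        self.rate = math.log(2) / half_life       # decay rate derived from the half-life
        self.entries = {}                         # pattern -> (score, last_update, pattern_id)
        self.next_id = 0

    def score(self, pattern, now):
        """Score of a stored pattern, decayed from its last update to time `now`."""
        s, t, _ = self.entries[pattern]
        return s * math.exp(-self.rate * (now - t))

    def observe(self, pattern, now):
        """Record one occurrence of `pattern`, then evict the weakest entry if full."""
        if pattern in self.entries:
            pid = self.entries[pattern][2]
            new_score = self.score(pattern, now) + 1.0
        else:
            pid = self.next_id
            self.next_id += 1
            new_score = 1.0
        self.entries[pattern] = (new_score, now, pid)
        if len(self.entries) > self.capacity:
            weakest = min(self.entries, key=lambda p: self.score(p, now))
            del self.entries[weakest]


def compress(stream, dictionary):
    """Replace patterns already in the dictionary by a short reference to their id."""
    out = []
    for now, pattern in stream:
        if pattern in dictionary.entries:
            out.append(("ref", dictionary.entries[pattern][2]))
        else:
            out.append(("raw", pattern))
        dictionary.observe(pattern, now)
    return out


# Toy stream of (timestamp, pattern); each pattern is a tuple of directed edges.
A = (("u", "v"), ("v", "w"))
B = (("x", "y"),)
C = (("p", "q"), ("q", "r"))
stream = [(0, A), (1, A), (2, B), (3, A), (50, C)]

pattern_dict = PatternDictionary(capacity=2, half_life=10.0)
encoded = compress(stream, pattern_dict)
```

In this toy run, repeated occurrences of pattern `A` are emitted as references, and when `C` arrives long after the earlier patterns, the entry whose decayed score is lowest is evicted to respect the capacity limit. A real decoder would have to replay the same dictionary updates to resolve the references.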

Keywords

Graph stream; Graph compression; Provenance data; Pattern dictionary; FP-tree

Subject

Computer Science and Mathematics, Computer Science
