Technical Note
Version 2
Preserved in Portico. This version is not peer-reviewed.
SG-ColBase: A Relational Column Database Kernel
Version 1 : Received: 11 November 2022 / Approved: 11 November 2022 / Online: 11 November 2022 (07:35:23 CET)
Version 2 : Received: 12 November 2022 / Approved: 14 November 2022 / Online: 14 November 2022 (03:02:09 CET)
How to cite: Zhou, T.; Liu, T. SG-ColBase: A Relational Column Database Kernel. Preprints 2022, 2022110220. https://doi.org/10.20944/preprints202211.0220.v2
Abstract
Diverse, highly concurrent Internet services often rely on heterogeneous systems composed of multiple databases to meet their requirements. This report introduces SG-ColBase, a database kernel we developed. It implements read-write concurrency control, data rollback, atomic log writing, and redo of data after downtime to provide complete transaction support. Field-level locks and snapshot reads extend the parallelism of kernel execution. A Bloom filter, resource cache pool, memory pool, skip list, non-blocking log cache, and asynchronous data-writing mechanism improve the overall execution efficiency of the system. For data storage, SG-ColBase introduces column storage, logical keys, and an LSM-tree. These improve the data compression ratio and reduce data gaps, and all disk data operations become sequential, incremental writes. Combined with asynchronous batch operation, this greatly improves data writing speed. Because column storage keeps the values of a column contiguous, column-wise traversal requires less disk scanning, which gives a substantial efficiency gain over traditional relational databases in big-data analysis scenarios. SG-ColBase can reduce the use of heterogeneous databases in business systems and improve R&D efficiency.
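As an illustration of one technique named in the abstract, the following is a minimal Bloom filter sketch in Python. It is not SG-ColBase's implementation: the class name `BloomFilter`, the method names, the default sizes, and the use of salted SHA-256 hashing are all assumptions chosen for clarity. It shows the general property that makes the structure useful alongside an LSM-tree: a negative answer is always correct, so a lookup can skip a file that certainly does not hold the key.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        # Sizes are arbitrary defaults for the sketch, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive num_hashes bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        # Set every bit position derived from the key.
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )
```

In use, a reader would call `might_contain(key)` before opening an on-disk file and only perform the read when the result is `True`.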
Keywords
Relational Database; Columnar Storage; Bloom Filter; Skip List; Field Level Lock; Read Write Concurrency; OLTP; OLAP; LSM-Tree; Token Bucket Algorithm
Subject
Computer Science and Mathematics, Information Systems
Copyright: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Comments (1)
Commenter: Tingzhen Liu
Commenter's Conflict of Interests: Author
Add "6. Resource Cache Pool Implementation - Figure 3"