Submitted: 14 April 2025
Posted: 15 April 2025
Abstract
Keywords:
1. Introduction
- We demonstrate that the set of high-frequency words changes across the different stages of a workload.
- We design a dynamic compression algorithm that tracks the high-frequency words in cache lines at each stage and updates a high-frequency word table according to their occurrence counts.
- We implement our algorithm on gem5 and NVMain and demonstrate its effectiveness.
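The idea behind the first two contributions can be illustrated with a minimal sketch. It is not the authors' algorithm: the class name `FrequentWordTable`, the epoch-based refresh policy, the table size, and the 32-bit word width are all assumptions made for illustration only.

```python
from collections import Counter


class FrequentWordTable:
    """Hypothetical tracker: counts words per epoch and refreshes a small table."""

    def __init__(self, table_size=4, epoch_lines=2):
        self.table_size = table_size    # entries kept in the table (assumed)
        self.epoch_lines = epoch_lines  # cache lines per update epoch (assumed)
        self.counts = Counter()         # occurrence counts within the current epoch
        self.lines_seen = 0
        self.table = []                 # current high-frequency words

    def observe(self, line: bytes) -> None:
        """Count each 32-bit word of a cache line; refresh the table at epoch end."""
        for i in range(0, len(line), 4):
            self.counts[int.from_bytes(line[i:i + 4], "little")] += 1
        self.lines_seen += 1
        if self.lines_seen % self.epoch_lines == 0:
            top = self.counts.most_common(self.table_size)
            self.table = [word for word, _ in top]
            self.counts.clear()         # forget the old stage so new words can win


def make_line(word: int, n_words: int = 16) -> bytes:
    """Build a 64-byte cache line filled with one repeated 32-bit word."""
    return word.to_bytes(4, "little") * n_words


tracker = FrequentWordTable(table_size=1, epoch_lines=2)
for _ in range(2):                      # stage 1 of a toy workload
    tracker.observe(make_line(0xDEADBEEF))
stage1_table = list(tracker.table)
for _ in range(2):                      # stage 2 writes a different word
    tracker.observe(make_line(0x12345678))
stage2_table = list(tracker.table)
```

In this toy run the table follows the workload: it holds `0xDEADBEEF` after the first stage and `0x12345678` after the second, which a table fixed at startup could not do.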
2. Background and Motivation
2.1. Basics of NVM
2.2. Existing Solutions
2.3. Motivation
3. System Design and Implementation
3.1. Overview
3.2. Operational Workflow
Algorithm 1. Dynamic Word Compression Algorithm
Algorithm 2. Dynamic Word Decompression Algorithm
3.3. Discussion and Overhead Analysis
4. Evaluation
4.1. Experimental Setup
4.2. Results of Compression Ratio
4.3. Results of Lifetime
4.4. Results of Latency
5. Related Works
6. Conclusions
Author Contributions
Data Availability Statement
Conflicts of Interest
| Prefix | Pattern Encoded | Example Word | Encoded Value (Prefix + Data) | Encoded Size |
|---|---|---|---|---|
| 000 | Zero Run | 0x00000000 | 0x0 | 3 bits |
| 001 | 4-bit Sign Extended | 0x00000005 | 0x15 | 7 bits |
| 010 | 1-byte Sign Extended | 0xFFFFFFC7 | 0x2C7 | 11 bits |
| 011 | Half-word Sign Extended | 0x00002765 | 0x32765 | 19 bits |
| 100 | Half-word, padded with zero Half-word | 0xBB210000 | 0x4BB21 | 19 bits |
| 101 | Two Half-words, each a byte extended | 0x00360016 | 0x53616 | 19 bits |
| 110 | Word consisting of four repeated bytes | 0x10101010 | 0x610 | 11 bits |
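The pattern matching in this table can be sketched in a few lines of Python. This is an illustrative encoder for a single 32-bit word, not the authors' implementation: the function names are hypothetical, patterns are tried in table order, and the `111` fallback for uncompressed words is an assumption taken from the original frequent pattern compression scheme rather than from the table.

```python
def sign_extends(value: int, bits: int, width: int = 32) -> bool:
    """True if `value` equals the sign extension of its low `bits` bits."""
    low_mask = (1 << bits) - 1
    low = value & low_mask
    if low >> (bits - 1):                        # sign bit of the low field is set
        low |= ((1 << width) - 1) & ~low_mask    # fill the upper bits with ones
    return low == value


def fpc_encode(word: int):
    """Return (prefix, data, data_bits) for one 32-bit word."""
    hi, lo = word >> 16, word & 0xFFFF
    if word == 0:
        return 0b000, 0, 0
    if sign_extends(word, 4):
        return 0b001, word & 0xF, 4
    if sign_extends(word, 8):
        return 0b010, word & 0xFF, 8
    if sign_extends(word, 16):
        return 0b011, lo, 16
    if lo == 0:                                  # half-word padded with a zero half-word
        return 0b100, hi, 16
    if sign_extends(hi, 8, 16) and sign_extends(lo, 8, 16):
        return 0b101, ((hi & 0xFF) << 8) | (lo & 0xFF), 16
    if word == (word & 0xFF) * 0x01010101:       # four repeated bytes
        return 0b110, word & 0xFF, 8
    return 0b111, word, 32                       # assumed fallback: uncompressed word


def pack(word: int):
    """Return (encoded value with prefix, encoded size in bits)."""
    prefix, data, data_bits = fpc_encode(word)
    return (prefix << data_bits) | data, 3 + data_bits


encoded = {hex(w): pack(w) for w in (0x00000005, 0xFFFFFFC7, 0x00360016)}
```

Under these assumptions, `pack` gives the same encoded values and sizes as the table rows for all seven example words.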

| Parameter | Configuration |
|---|---|
| Processor and Cache | |
| CPU | Single-core x86-64 processor, 500 MHz |
| Private L1/L2 Cache | 32 KB/2048 KB |
| Memory (PCM-Based) | |
| Memory | Capacity: 8 GB, 1 channel, 1 rank, 8 banks |
| Memory Controller | First-Ready-First-Come-First-Serve (FRFCFS) |
| Set Latency | 50 cycles per 32-bit word |
| Reset Latency | 25 cycles per 32-bit word |
| Read Latency | 53 cycles |
| Parameters of DCom | |
| Compression Latency | 3 cycles |
| Decompression Latency | 2 cycles |
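To put the DCom latencies in proportion, the short calculation below compares them with the PCM access latencies listed in the table. It is a back-of-the-envelope sketch only: it assumes that all cycle counts refer to the same clock and that the compression and decompression latencies simply add to the corresponding access, which the table itself does not state.

```python
# Cycle counts taken from the configuration table.
SET_LATENCY = 50            # per 32-bit word
RESET_LATENCY = 25          # per 32-bit word
READ_LATENCY = 53
COMPRESSION_LATENCY = 3
DECOMPRESSION_LATENCY = 2

# Assumption: latencies share one clock domain and add serially.
read_overhead = DECOMPRESSION_LATENCY / READ_LATENCY
set_overhead = COMPRESSION_LATENCY / SET_LATENCY
reset_overhead = COMPRESSION_LATENCY / RESET_LATENCY

print(f"decompression vs. read:   {read_overhead:.1%}")
print(f"compression vs. one set:  {set_overhead:.1%}")
print(f"compression vs. one reset: {reset_overhead:.1%}")
```

Under these assumptions, decompression adds roughly 4% to a read, and compression adds between 6% and 12% to a single-word write, depending on whether the write is a set or a reset.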
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).