Submitted: 08 September 2023
Posted: 11 September 2023
Abstract
Keywords:
1. Introduction
2. Preliminaries
3. Approach
3.1. Proposed Methodology
3.2. Mesh Planning
Algorithm 1: The mesh planning phase
3.3. Mesh Synthesis
4. Experimental Results
| Benchmarks | Original flow: Grid (M*N) | Original flow: Wire width (3X-5X) | Original flow: Skew (ps) | Original flow: Power (mW) | Original flow: Runtime (h) | Proposed methodology: Grid (M*N) | Proposed methodology: Wire width (2X-6X) | Proposed methodology: Skew (ps) | Proposed methodology: Power (mW) | Proposed methodology: Runtime (h) |
|---|---|---|---|---|---|---|---|---|---|---|
| s13207 | 7*7 | 3X | 8.3 | 49.3 | 1.2 | 6*6 | 4X | 7.7 | 32.4 | 0.25 |
| s15850 | 7*7 | 3X | 4.2 | 42.2 | 1.8 | 5*5 | 4X | 3.1 | 30.5 | 0.29 |
| s35932 | 14*14 | 3X | 19.3 | 137.2 | 2.2 | 10*10 | 4X | 8.5 | 77.3 | 0.36 |
| s38417 | 14*14 | 4X | 16.7 | 127.3 | 2.5 | 11*7 | 5X | 15.2 | 78.6 | 0.42 |
| s38584 | 11*11 | 4X | 15.2 | 118.3 | 1.9 | 8*6 | 5X | 13.9 | 78.2 | 0.38 |
| blob_merge | 20*20 | 4X | 22.1 | 253.7 | 3.5 | 18*14 | 5X | 20.3 | 247.8 | 0.65 |
| bmg | 85*85 | 4X | 27.5 | 425.6 | 5.3 | 64*54 | 5X | 25.9 | 382.1 | 1.38 |
| sha | 16*16 | 4X | 18.1 | 165.2 | 2.1 | 14*8 | 5X | 17.3 | 93.6 | 0.51 |
| stero2 | 120*120 | 4X | 56.4 | 586.1 | 4.2 | 86*72 | 6X | 46.5 | 421.6 | 1.67 |
| ucsb | 24*24 | 4X | 25.5 | 279.4 | 2.8 | 19*16 | 5X | 21.7 | 266.3 | 0.58 |
| 1664 microprocessor | 155*135 | 4X | 78.4 | 956.3 | 15.1 | 122*96 | 6X | 66.1 | 718.6 | 3.2 |
| 8051 core | 145*125 | 4X | 69.4 | 853.1 | 14.2 | 118*92 | 6X | 53.5 | 633.4 | 3.4 |
| microprocessor za208 | 140*130 | 4X | 62.3 | 821.1 | 13.9 | 101*82 | 5X | 49.3 | 650.6 | 3.15 |
| DarkRISCV | 170*150 | 5X | 87.9 | 962.7 | 21.5 | 158*126 | 6X | 83.6 | 702.7 | 4.78 |
| Neptune Core | 185*140 | 5X | 92.4 | 1075.1 | 26.0 | 172*120 | 6X | 89.7 | 818.2 | 5.61 |
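
The improvements in the table above can be summarized as relative reductions per benchmark. The following is a minimal sketch of that arithmetic, assuming the simple metric `100 * (original - proposed) / original`; the names `ROWS`, `reduction_pct`, and `summarize` are hypothetical helpers introduced for illustration and are not part of the paper's flow. The numbers are copied from the skew, power, and runtime columns of the table.

```python
# Each row: (benchmark, original (skew_ps, power_mW, runtime_h), proposed (same order)).
# Values are copied from the results table; the helper names are illustrative only.
ROWS = [
    ("s13207", (8.3, 49.3, 1.2), (7.7, 32.4, 0.25)),
    ("s15850", (4.2, 42.2, 1.8), (3.1, 30.5, 0.29)),
    ("s35932", (19.3, 137.2, 2.2), (8.5, 77.3, 0.36)),
    ("s38417", (16.7, 127.3, 2.5), (15.2, 78.6, 0.42)),
    ("s38584", (15.2, 118.3, 1.9), (13.9, 78.2, 0.38)),
    ("blob_merge", (22.1, 253.7, 3.5), (20.3, 247.8, 0.65)),
    ("bmg", (27.5, 425.6, 5.3), (25.9, 382.1, 1.38)),
    ("sha", (18.1, 165.2, 2.1), (17.3, 93.6, 0.51)),
    ("stero2", (56.4, 586.1, 4.2), (46.5, 421.6, 1.67)),
    ("ucsb", (25.5, 279.4, 2.8), (21.7, 266.3, 0.58)),
    ("1664 microprocessor", (78.4, 956.3, 15.1), (66.1, 718.6, 3.2)),
    ("8051 core", (69.4, 853.1, 14.2), (53.5, 633.4, 3.4)),
    ("microprocessor za208", (62.3, 821.1, 13.9), (49.3, 650.6, 3.15)),
    ("DarkRISCV", (87.9, 962.7, 21.5), (83.6, 702.7, 4.78)),
    ("Neptune Core", (92.4, 1075.1, 26.0), (89.7, 818.2, 5.61)),
]


def reduction_pct(original, proposed):
    """Relative reduction of `proposed` versus `original`, in percent."""
    return 100.0 * (original - proposed) / original


def summarize(rows):
    """Return the mean reduction (skew, power, runtime) in percent over all rows."""
    per_row = [
        tuple(reduction_pct(o, p) for o, p in zip(orig, prop))
        for _, orig, prop in rows
    ]
    return tuple(sum(col) / len(per_row) for col in zip(*per_row))


if __name__ == "__main__":
    for name, orig, prop in ROWS:
        skew, power, runtime = (reduction_pct(o, p) for o, p in zip(orig, prop))
        print(f"{name:22s} skew {skew:5.1f}%  power {power:5.1f}%  runtime {runtime:5.1f}%")
    mean_skew, mean_power, mean_runtime = summarize(ROWS)
    print(f"mean: skew {mean_skew:.1f}%  power {mean_power:.1f}%  runtime {mean_runtime:.1f}%")
```

An unweighted mean treats small and large benchmarks equally; a design-size-weighted average would be an equally reasonable choice and may give different figures.
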
| IP Module | APE | |
|---|---|---|
| Experiments | C | O |
| Clock tree levels | 36 | 23 |
| Total number of buffers | 8952 | 3478 |
| WNS(ns) | -0.36 | -0.12 |
| TNS(ns) | -925.7 | -425.6 |
| Clock network latency(ns) | 1.85 | 1.64 |
| Power consumption(mW) | 1036.2 | 682.3 |
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest

| BM. | Sinks | LAT. (ps) | Grid (M*N) | SC | Area RD. | Run.0 (min) | Run.1 (min) |
|---|---|---|---|---|---|---|---|
| 01 | 1107 | 20 | 27*18 | 12.5 | 8.3% | 56 | 27 |
| 02 | 2249 | 20 | 38*21 | 27.9 | 12.5% | 101 | 50 |
| 03 | 1200 | 20 | 16*12 | 16.2 | 9.7% | 9 | 4 |
| 04 | 1845 | 20 | 17*18 | 25.1 | 10.3% | 31 | 14 |
| 05 | 1016 | 10 | 14*12 | 8.3 | 9.2% | 18 | 8 |
| 06 | 981 | 10 | 13*12 | 7.2 | 7.8% | 15 | 6 |
| 07 | 1915 | 20 | 22*13 | 24.6 | 11.2% | 27 | 13 |
| 08 | 1134 | 10 | 14*15 | 9.1 | 8.9% | 8 | 4 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).