Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

ECHO: Energy-Efficient Computation Harnessing Online Arithmetic – a MSDF-Based Accelerator for DNN Inference

Version 1: Received: 8 April 2024 / Approved: 8 April 2024 / Online: 8 April 2024 (14:46:36 CEST)

How to cite: Ibrahim, M.S.; Usman, M.; Lee, J. ECHO: Energy-Efficient Computation Harnessing Online Arithmetic – a MSDF-Based Accelerator for DNN Inference. Preprints 2024, 2024040561. https://doi.org/10.20944/preprints202404.0561.v1

Abstract

Deep neural network (DNN) inference demands substantial computing power and therefore consumes significant energy. In convolution layers, a large share of output activations are negative and are set to zero by the ReLU activation function, so the computations that produce them are unnecessary and waste energy. This paper presents ECHO (Energy-efficient Computation Harnessing Online Arithmetic), an MSDF-based accelerator for DNN inference designed for computation pruning. ECHO uses an unconventional arithmetic paradigm known as online, or most-significant-digit-first (MSDF), arithmetic, which performs computations in a digit-serial manner. The MSDF digit-serial computation of online arithmetic allows successive operations to overlap, leading to substantial performance improvements. Coupled with a negative-output detection scheme, online arithmetic enables early and precise recognition of negative outputs, so unnecessary computations can be terminated promptly and energy consumption reduced. The design was implemented on a Xilinx Virtex-7 VU3P FPGA and evaluated through a comparative analysis using widely used performance metrics. Experimental results demonstrate promising power and throughput improvements over contemporary methods. In particular, compared with the conventional bit-serial design, the proposed design improved average power consumption by up to 81%, 82.9%, and 40.6% for the VGG-16, ResNet-18, and ResNet-50 workloads, respectively. Average speedups of 2.39×, 2.6×, and 2.42× over conventional bit-serial designs were observed for the VGG-16, ResNet-18, and ResNet-50 models, respectively.
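
The early-termination idea described in the abstract can be illustrated with a minimal Python sketch. It is not the ECHO hardware design: it assumes the pre-activation result arrives as a radix-2 signed-digit fraction with digits in {-1, 0, 1}, most significant digit first, and it does not model online delay or the overlap of successive operations. The function names `sd_value` and `relu_msdf` are hypothetical. The sketch relies on one property of this representation: the digits after position k can contribute strictly less than 2^-k in magnitude, so the sign of the whole number equals the sign of its first nonzero digit.

```python
from typing import Sequence, Tuple


def sd_value(digits: Sequence[int]) -> float:
    """Value of a radix-2 signed-digit fraction: sum of d_i * 2**-i for i = 1..n."""
    return sum(d * 2.0 ** -(i + 1) for i, d in enumerate(digits))


def relu_msdf(digits: Sequence[int]) -> Tuple[float, int]:
    """Apply ReLU to an MSDF digit stream, stopping early on a negative result.

    Returns (relu_output, digits_processed). If the first nonzero digit
    is -1, the final value must be negative, so the remaining digits are
    skipped and the output is 0.
    """
    sign_known = False
    processed = 0
    value = 0.0
    for i, d in enumerate(digits):
        if d not in (-1, 0, 1):
            raise ValueError(f"invalid signed digit: {d}")
        processed += 1
        if not sign_known and d != 0:
            if d == -1:
                # Negative result detected: terminate the computation here.
                return 0.0, processed
            sign_known = True  # positive result: all digits are still needed
        value += d * 2.0 ** -(i + 1)
    return value, processed


if __name__ == "__main__":
    negative = [0, -1, 1, 1, 0, 1]   # value is -0.046875
    positive = [0, 1, -1, 0]         # value is 0.125
    print(sd_value(negative), relu_msdf(negative))
    print(sd_value(positive), relu_msdf(positive))
```

In this toy model the negative input is resolved after 2 of its 6 digits, while the positive input needs all of its digits. The saving on negative outputs is the source of the computation pruning that the abstract refers to.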

Keywords

Computation pruning; early negative detection; CNN acceleration; convolutional neural network; most-significant-digit first; online arithmetic

Subject

Computer Science and Mathematics, Hardware and Architecture
