Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

Research on Convolutional Neural Network Inference Acceleration and Performance Optimization for Edge Intelligence

Version 1: Received: 8 December 2023 / Approved: 8 December 2023 / Online: 11 December 2023 (04:31:51 CET)

A peer-reviewed article of this Preprint also exists.

Liang, Y.; Tan, J.; Xie, Z.; Chen, Z.; Lin, D.; Yang, Z. Research on Convolutional Neural Network Inference Acceleration and Performance Optimization for Edge Intelligence. Sensors 2024, 24, 240.

Abstract

In recent years, Edge Intelligence (EI) has emerged, combining edge computing with AI, specifically deep learning, to run AI algorithms directly on edge devices. In practical applications, EI faces constraints on computational power, power consumption, size, and cost, the primary challenge being the trade-off between computational power and power consumption. This trade-off has made traditional computing platforms unsustainable for EI, so heterogeneous parallel computing platforms have become a crucial pathway for implementing it. In our research, we used the Xilinx Zynq 7000 heterogeneous computing platform and High-Level Synthesis (HLS) to design and implement two accelerators for LeNet-5, one optimized with loop unrolling (UNROLL) and one with pipelining (PIPELINE). The experimental results show that, at a clock speed of 100 MHz, the PIPELINE accelerator consumes 8.09% more power than the UNROLL accelerator but achieves a speedup of 14.972 times over it, making the PIPELINE accelerator superior in performance. Compared to the CPU, the PIPELINE accelerator reduces power consumption by 91.37% and achieves a speedup of 70.387 times; compared to the GPU, it reduces power consumption by 93.35%. Through design and experimentation, this study provides two optimization schemes for edge intelligence applications and demonstrates the impact of different quantization methods on FPGA resource consumption. These results offer a reference hardware acceleration scheme for practical edge intelligence applications.
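
To illustrate the two optimization styles named in the abstract, the following is a minimal sketch of a single-channel 5x5 convolution written in HLS-style C++. It is not the authors' accelerator code: the function names `conv_unroll` and `conv_pipeline`, the constants, and the 32x32 input size (taken from the classic LeNet-5 first layer) are assumptions made for illustration. The `#pragma HLS UNROLL` and `#pragma HLS PIPELINE` directives are standard Xilinx HLS pragmas; how they are actually placed in the paper's design is not stated here.

```cpp
#include <cassert>

// Hypothetical layer dimensions (assumed: classic LeNet-5 first conv layer).
constexpr int IN_DIM = 32;                    // input feature map is 32x32
constexpr int K_DIM = 5;                      // kernel is 5x5
constexpr int OUT_DIM = IN_DIM - K_DIM + 1;   // valid convolution: 28x28

// Variant A (illustrative): fully unroll the two kernel loops so the
// 25 multiply-accumulates of one output pixel become parallel hardware.
void conv_unroll(const float in[IN_DIM][IN_DIM],
                 const float w[K_DIM][K_DIM],
                 float out[OUT_DIM][OUT_DIM]) {
  for (int r = 0; r < OUT_DIM; ++r) {
    for (int c = 0; c < OUT_DIM; ++c) {
      float acc = 0.0f;
      for (int i = 0; i < K_DIM; ++i) {
#pragma HLS UNROLL
        for (int j = 0; j < K_DIM; ++j) {
#pragma HLS UNROLL
          acc += in[r + i][c + j] * w[i][j];
        }
      }
      out[r][c] = acc;
    }
  }
}

// Variant B (illustrative): pipeline the output-column loop so the tool
// tries to start a new output pixel every clock cycle (II=1). Loops nested
// inside a pipelined loop are unrolled by the HLS tool automatically.
void conv_pipeline(const float in[IN_DIM][IN_DIM],
                   const float w[K_DIM][K_DIM],
                   float out[OUT_DIM][OUT_DIM]) {
  for (int r = 0; r < OUT_DIM; ++r) {
    for (int c = 0; c < OUT_DIM; ++c) {
#pragma HLS PIPELINE II=1
      float acc = 0.0f;
      for (int i = 0; i < K_DIM; ++i) {
        for (int j = 0; j < K_DIM; ++j) {
          acc += in[r + i][c + j] * w[i][j];
        }
      }
      out[r][c] = acc;
    }
  }
}
```

A standard C++ compiler ignores the HLS pragmas, so both variants can be checked for functional equivalence on a host machine before synthesis. Whether the requested initiation interval is achieved in hardware depends on memory port availability and array partitioning, which this sketch does not address.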

Keywords

FPGA; HLS; Edge Intelligence; deep learning; heterogeneous computing

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
