For convolutional neural networks (CNNs), the depthwise separable CNN (DSCNN) is a preferred architecture for application-specific integrated circuit (ASIC) implementation on edge devices, and it can benefit from the multi-mode approximate multiplier proposed in this work. The proposed multiplier uses two 4-bit multiplication operations to implement a 12-bit multiplication operation by reusing the same multiplier array. With this multiplier, sequential multiplication operations are pipelined in a modified DSCNN to fully utilize the processing element (PE) array in the convolutional layer. This approximate DSCNN (A-DSCNN) was implemented in a TSMC 40-nm CMOS process with a supply voltage of 0.9 V. At a clock frequency of 200 MHz, the design achieves 4.78 GOPs/mW while occupying a silicon area of 1.24 mm × 1.24 mm. Compared with a conventional DSCNN implemented in a similar process node, chip area and power consumption were reduced by 53% and 25%, respectively, while throughput was improved by 17%.
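As background on why depthwise separable convolution suits multiplier-constrained edge hardware, the following minimal sketch compares multiply-accumulate (MAC) counts for a standard convolution layer and its depthwise separable counterpart. The function names and the layer dimensions are illustrative assumptions, not values from this work, and the output feature map is assumed to have the same spatial size as the input (stride 1, same padding).

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution over an h x w output map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer shape chosen only for illustration.
    h, w, c_in, c_out, k = 32, 32, 64, 128, 3
    std = standard_conv_macs(h, w, c_in, c_out, k)
    dsc = depthwise_separable_macs(h, w, c_in, c_out, k)
    print(f"standard conv MACs:       {std}")
    print(f"depthwise separable MACs: {dsc}")
    print(f"ratio (DSC / standard):   {dsc / std:.4f}")
```

The ratio equals 1/c_out + 1/k², so for 3 × 3 kernels and a large number of output channels the separable form needs roughly one ninth of the multiplications of the standard layer.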
Engineering, Electrical and Electronic Engineering
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.