ARC: Adaptive Rounding and Clipping Considering Gradient Distribution for Deep Convolutional Neural Network Training

Da Hun Choi, Hyun Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

In convolutional neural networks (CNNs), quantization is an effective compression method: it conserves hardware resources by performing convolution operations, which account for the majority of computation, at lower bit widths. Most quantization studies have focused on weight and activation parameters. Gradient quantization, however, is central to quantization for CNN training, and even small changes in the gradient have a significant impact on training, making it difficult to achieve high accuracy. Although previous works on gradient quantization achieved high accuracy using stochastic rounding (SR), SR suffers from high latency in random-number generation and is difficult to implement in register-transfer-level (RTL) hardware designs. Additionally, searching for a clipping value based on the quantization error is effective early in training but becomes inadequate after the model converges, since the quantization error decreases. In this paper, we address the limitations of SR with an approach based on deterministic rounding, specifically rounding toward zero (RTZ). To handle outliers in the gradient distribution, we also search for a suitable clipping value based on the z-score, which remains appropriate even after the network converges. Experimental results show that the proposed method achieves higher accuracy on various vision tasks, such as ResNet, YOLOv5, and YOLACT, and offers robust compatibility. The proposed quantizer was verified in RTL, achieving accuracy similar to that of SR while using resources comparable to nearest rounding. Moreover, when latency was measured on a CPU, the proposed method achieved 43% lower latency than SR.
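The abstract describes two ingredients: deterministic rounding toward zero (RTZ) in place of stochastic rounding, and a z-score-based clipping value for gradient outliers. The sketch below illustrates how such a quantizer could look in NumPy; the function name, the symmetric clipping rule `|mean| + z·std`, and the default `z` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def rtz_quantize_grad(grad, bits=8, z=3.0):
    """Illustrative gradient quantizer: z-score-based clipping followed by
    uniform symmetric quantization with rounding toward zero (RTZ).
    Not the paper's exact algorithm -- a sketch of the described idea."""
    # Clip outliers beyond z standard deviations of the gradient distribution
    # (assumed symmetric clipping value; the paper's search may differ).
    mu, sigma = grad.mean(), grad.std()
    clip = abs(mu) + z * sigma
    g = np.clip(grad, -clip, clip)
    # Uniform symmetric quantization; np.trunc rounds toward zero,
    # avoiding the random-number generation that SR requires.
    qmax = 2 ** (bits - 1) - 1
    scale = clip / qmax if clip > 0 else 1.0
    q = np.trunc(g / scale)
    return q * scale  # dequantized gradient

rng = np.random.default_rng(0)
grads = rng.normal(scale=0.01, size=1024)
deq = rtz_quantize_grad(grads, bits=8)
```

Because `np.trunc` is a plain elementwise operation, the rounding step is deterministic and maps directly to simple RTL logic, which is the motivation the abstract gives for preferring RTZ over SR.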

Original language: English
Title of host publication: ISCAS 2024 - IEEE International Symposium on Circuits and Systems
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350330991
DOIs
State: Published - 2024
Event: 2024 IEEE International Symposium on Circuits and Systems, ISCAS 2024 - Singapore, Singapore
Duration: 19 May 2024 - 22 May 2024

Publication series

Name: Proceedings - IEEE International Symposium on Circuits and Systems
ISSN (Print): 0271-4310

Conference

Conference: 2024 IEEE International Symposium on Circuits and Systems, ISCAS 2024
Country/Territory: Singapore
City: Singapore
Period: 19/05/24 - 22/05/24

Keywords

  • Convolutional neural network
  • Deep learning
  • Gradient quantization
  • Low-power design
