GASQ: Hardware-Friendly Gradient Distribution-Aware Split Quantization for Low-Bit CNN Training

Sangbeom Jeong, Kwanghyun Koo, Hyun Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

As demand for personalized AI models rises and privacy protection becomes more important, there is increasing interest in efficiently training convolutional neural networks (CNNs) on mobile and edge devices. Because backward propagation (BP) requires significantly more computation and memory than forward propagation, applying low-bit quantization to BP offers greater potential for improving the efficiency of CNN training. However, the variability and specificity of the gradient distribution during BP make gradient quantization particularly challenging. Existing studies attempt to mitigate this issue through additional computations, which often increase hardware complexity. To address this, we propose a hardware-efficient INT8 quantization method, gradient distribution-aware split quantization (GASQ), which is robust to gradient quantization errors. GASQ employs distinct scale factors for small- and large-magnitude gradients, effectively capturing the gradient distribution, which is predominantly centered around zero yet spans a broad range. This approach maintains low hardware complexity while achieving minimal quantization error. The proposed method demonstrates an average 0.27% performance improvement over full-precision models on the ImageNet dataset for classification tasks.
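The idea of using separate scale factors for small- and large-magnitude gradients can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the split point or the scales are chosen, so the fixed `threshold` argument, the max-based scales, the per-element region flag, and the function names `split_quantize_int8` and `uniform_quantize_int8` are all assumptions made for illustration.

```python
import numpy as np

def split_quantize_int8(grad, threshold):
    """Quantize-dequantize `grad` with two INT8 scales (illustrative only).

    Elements with |g| <= threshold use a fine scale; the rest use a coarse
    scale covering the full range. `threshold` is a hypothetical parameter,
    not something specified by the abstract.
    """
    grad = np.asarray(grad, dtype=np.float64)
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    is_small = np.abs(grad) <= threshold
    max_abs = float(np.max(np.abs(grad))) if grad.size else 0.0
    scale_small = threshold / 127.0                   # fine step near zero
    scale_large = max(max_abs, threshold) / 127.0     # coarse step for outliers
    scale = np.where(is_small, scale_small, scale_large)
    q = np.clip(np.rint(grad / scale), -127, 127).astype(np.int8)
    dequant = q.astype(np.float64) * scale
    return q, is_small, dequant

def uniform_quantize_int8(grad):
    """Baseline: a single symmetric INT8 scale for the whole tensor."""
    grad = np.asarray(grad, dtype=np.float64)
    max_abs = float(np.max(np.abs(grad))) if grad.size else 0.0
    scale = max(max_abs, 1e-12) / 127.0               # guard against all-zero input
    q = np.clip(np.rint(grad / scale), -127, 127).astype(np.int8)
    return q, q.astype(np.float64) * scale

# Synthetic "gradient": mass concentrated near zero plus a few large outliers.
rng = np.random.default_rng(0)
grad = rng.normal(0.0, 1e-3, size=10_000)
grad[:10] = 0.5

_, _, deq_split = split_quantize_int8(grad, threshold=4e-3)
_, deq_uniform = uniform_quantize_int8(grad)
mse_split = float(np.mean((grad - deq_split) ** 2))
mse_uniform = float(np.mean((grad - deq_uniform) ** 2))
print(f"split MSE:   {mse_split:.3e}")
print(f"uniform MSE: {mse_uniform:.3e}")
```

On this synthetic tensor the single-scale baseline spends most of its INT8 range on the few outliers, so the values near zero are rounded coarsely. The split version keeps a fine step for that region, at the cost of one extra flag per element to record which scale applies.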

Original language: English
Title of host publication: 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331510756
State: Published - 2025
Event: 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025 - Osaka, Japan
Duration: 19 Jan 2025 to 22 Jan 2025

Keywords

  • Convolutional neural networks
  • gradient quantization
  • low-bit training
  • on-device AI
