Hardware-Friendly Quantization via Outlier Scaling in Convolution-Attention-Based Hybrid Networks

Nam Joon Kim, Hyun Kim

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Hybrid networks that combine convolution and attention have achieved state-of-the-art performance in computer vision tasks. Quantization is a promising compression method for deploying these hybrid networks efficiently on resource-constrained consumer electronics such as IoT devices. However, although many hybrid networks have been proposed, quantization methods dedicated to hybrid networks remain largely unexplored. To bridge this gap, we propose a novel hardware-friendly post-training quantization method. We first observe significant outliers in the bottleneck blocks of hybrid networks, which cause severe accuracy degradation under quantization. To address these outliers, we propose a novel outlier scaling method together with an objective function-based power-of-two approximation method that replaces conventional floating-point multiplication with hardware-friendly shift operations. To demonstrate the effectiveness of the proposed method, we conducted experiments on the ImageNet-1k dataset using a representative hybrid network, MobileViT. Compared to existing quantization methods at the same model size, the proposed method significantly mitigated the accuracy drop with only a small increase in parameters.
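
The abstract does not give the outlier scaling rule or the objective function, so the following is only a minimal sketch of the general idea rather than the authors' method. It assumes a symmetric 8-bit quantizer, mean squared reconstruction error as the objective, and a brute-force search over shift amounts. The names `quantize` and `best_shift`, the step size, and the activation values are all hypothetical.

```python
def quantize(values, step, bits=8):
    """Symmetric uniform quantization: return integer codes clamped to the signed range."""
    qmax = 2 ** (bits - 1) - 1
    return [max(-qmax - 1, min(qmax, round(v / step))) for v in values]


def best_shift(values, step, bits=8, max_shift=8):
    """Search for the power-of-two scale 2**k that minimizes reconstruction MSE.

    Assumption: MSE stands in for the paper's (unspecified) objective function.
    """
    best_k, best_err = 0, float("inf")
    for k in range(max_shift + 1):
        # Scale the outlier values down by 2**k so they fit the shared range.
        codes = quantize([v / (1 << k) for v in values], step, bits)
        # Undo the scaling with an integer left shift instead of a float multiply.
        recon = [(q << k) * step for q in codes]
        err = sum((v - r) ** 2 for v, r in zip(values, recon)) / len(values)
        if err < best_err:
            best_k, best_err = k, err
    return best_k, best_err


# Made-up outlier activations that overflow an 8-bit range with step 0.1.
outliers = [100.0, -80.0, 95.5, 60.2]
k, err = best_shift(outliers, step=0.1)
print(f"chosen shift k={k}, reconstruction MSE={err:.4f}")
```

Restricting the scale to a power of two is what lets the rescaling step be a shift (`q << k`) on integer codes. A scale that is too small leaves the outliers clipped, while one that is too large wastes resolution, which is why the sketch searches over `k` instead of simply rounding a floating-point scale to the nearest power of two.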

Original language: English
Title of host publication: 2025 IEEE International Conference on Consumer Electronics, ICCE 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331521165
DOIs
State: Published - 2025
Event: 2025 IEEE International Conference on Consumer Electronics, ICCE 2025 - Las Vegas, United States
Duration: 11 Jan 2025 → 14 Jan 2025

Publication series

Name: Digest of Technical Papers - IEEE International Conference on Consumer Electronics
ISSN (Print): 0747-668X
ISSN (Electronic): 2159-1423

Conference

Conference: 2025 IEEE International Conference on Consumer Electronics, ICCE 2025
Country/Territory: United States
City: Las Vegas
Period: 11/01/25 → 14/01/25

Keywords

  • Convolution neural network
  • Hybrid network
  • IoT devices
  • Quantization
  • Transformer
