TY - GEN
T1 - Feature Map-Aware Activation Quantization for Low-bit Neural Networks
AU - Lee, Seungjin
AU - Kim, Hyun
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/6/27
Y1 - 2021/6/27
N2 - Quantization, the most popular deep neural network (DNN) compression method, reduces computational complexity and saves substantial memory by converting 32-bit floating-point values to low-bit integer values. However, because DNNs are widely deployed on mobile and edge devices, which have comparatively limited hardware resources, there is demand for more aggressive quantization methods. To meet this need, this paper introduces a dedicated method that divides the activation maps of a DNN into several regions according to activation size and quantizes them to 4 bits, setting the scale factor adaptively for each region. When applied to the backbone of YOLACT, a representative instance segmentation model, the proposed method achieves an increase of approximately 2% in both box and mask mAP compared with naive 4-bit activation quantization.
KW - Deep learning
KW - Low-precision
KW - Quantization
KW - ResNet50
KW - YOLACT
UR - http://www.scopus.com/inward/record.url?scp=85113986255&partnerID=8YFLogxK
U2 - 10.1109/ITC-CSCC52171.2021.9501414
DO - 10.1109/ITC-CSCC52171.2021.9501414
M3 - Conference contribution
AN - SCOPUS:85113986255
T3 - 2021 36th International Technical Conference on Circuits/Systems, Computers and Communications, ITC-CSCC 2021
BT - 2021 36th International Technical Conference on Circuits/Systems, Computers and Communications, ITC-CSCC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 36th International Technical Conference on Circuits/Systems, Computers and Communications, ITC-CSCC 2021
Y2 - 27 June 2021 through 30 June 2021
ER -