TY - GEN
T1 - Hardware-Friendly Logarithmic Quantization with Mixed-Precision for MobileNetV2
AU - Choi, Dahun
AU - Kim, Hyun
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Convolutional neural networks (CNNs) have achieved excellent accuracy in a variety of computer vision applications. However, for a CNN to run on embedded platforms such as mobile devices, its hardware resource usage and power consumption must be reduced. Accordingly, research on applying low-precision quantization to lightweight networks such as MobileNet has attracted considerable attention. In particular, compared to linear quantization, logarithmic quantization can significantly reduce hardware resources by replacing multiplication operations with addition operations in hardware accelerator implementations. In this study, we propose a novel logarithmic weight quantization scheme tailored to the characteristics of MobileNetV2, which is notoriously difficult to quantize, together with a mixed-precision quantization that minimizes accuracy loss by learning the distribution range through a trainable parameter α. Experimental results show that the proposed method improves accuracy by more than 1.47% and 2% on the CIFAR-10 and Tiny-ImageNet datasets, respectively, compared to general log-scale quantization methods. Consequently, the proposed method achieves a significant reduction in hardware resources with only a slight degradation in performance compared to full precision (i.e., FP32), and reduces power consumption by about 48% compared to linear-scale quantization at the same precision.
KW - Convolutional Neural Network
KW - Deep learning
KW - logarithmic quantization
KW - MobileNetV2
UR - http://www.scopus.com/inward/record.url?scp=85139047578&partnerID=8YFLogxK
U2 - 10.1109/AICAS54282.2022.9869994
DO - 10.1109/AICAS54282.2022.9869994
M3 - Conference contribution
AN - SCOPUS:85139047578
T3 - Proceeding - IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022
SP - 348
EP - 351
BT - Proceeding - IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE International Conference on Artificial Intelligence Circuits and Systems, AICAS 2022
Y2 - 13 June 2022 through 15 June 2022
ER -