TY - JOUR
T1 - Zero-Centered Fixed-Point Quantization with Iterative Retraining for Deep Convolutional Neural Network-Based Object Detectors
AU - Kim, Sungrae
AU - Kim, Hyun
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2021
Y1 - 2021
N2 - In the field of object detection, deep learning has greatly improved accuracy compared to previous algorithms and has been widely used in recent years. However, object detection based on deep learning requires substantial hardware (HW) resources because of the enormous amount of computation needed for high performance, making real-time operation on embedded platforms very difficult. Various compression methods have therefore been studied to solve this problem. In particular, quantization methods greatly reduce the computational burden of deep learning by reducing the number of bits used to represent weights and activations. However, most existing studies have targeted only object classification and cannot be applied to object detection. Furthermore, most existing quantization studies are based on floating-point operations, which require additional effort when implementing HW accelerators. This paper proposes an HW-friendly, fixed-point-based quantization method that can also be applied to object detection. In the proposed method, the center of the weight distribution is shifted to zero by subtracting the mean of the weight parameters before quantization, and retraining is applied iteratively to minimize the accuracy drop caused by quantization. In addition, when the proposed method is applied to object detection, performance degradation is minimized by taking the minimum and maximum values of the network's weight parameters into account. When the proposed quantization method is applied to representative one-stage object detectors, You Only Look Once v3 and v4 (YOLOv3 and YOLOv4), detection accuracy on the COCO dataset remains similar to that of the original networks in single-precision floating-point (32-bit) format, even though the weights are expressed with only about 20% of the bits of the single-precision format.
AB - In the field of object detection, deep learning has greatly improved accuracy compared to previous algorithms and has been widely used in recent years. However, object detection based on deep learning requires substantial hardware (HW) resources because of the enormous amount of computation needed for high performance, making real-time operation on embedded platforms very difficult. Various compression methods have therefore been studied to solve this problem. In particular, quantization methods greatly reduce the computational burden of deep learning by reducing the number of bits used to represent weights and activations. However, most existing studies have targeted only object classification and cannot be applied to object detection. Furthermore, most existing quantization studies are based on floating-point operations, which require additional effort when implementing HW accelerators. This paper proposes an HW-friendly, fixed-point-based quantization method that can also be applied to object detection. In the proposed method, the center of the weight distribution is shifted to zero by subtracting the mean of the weight parameters before quantization, and retraining is applied iteratively to minimize the accuracy drop caused by quantization. In addition, when the proposed method is applied to object detection, performance degradation is minimized by taking the minimum and maximum values of the network's weight parameters into account. When the proposed quantization method is applied to representative one-stage object detectors, You Only Look Once v3 and v4 (YOLOv3 and YOLOv4), detection accuracy on the COCO dataset remains similar to that of the original networks in single-precision floating-point (32-bit) format, even though the weights are expressed with only about 20% of the bits of the single-precision format.
KW - Convolutional neural network
KW - deep neural network
KW - fixed-point quantization
KW - network compression
KW - object detector
KW - YOLOv3
KW - YOLOv4
UR - https://www.scopus.com/pages/publications/85100499208
U2 - 10.1109/access.2021.3054879
DO - 10.1109/access.2021.3054879
M3 - Article
AN - SCOPUS:85100499208
SN - 2169-3536
VL - 9
SP - 20828
EP - 20839
JO - IEEE Access
JF - IEEE Access
M1 - 9336635
ER -
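
The abstract above describes the core idea: subtract the mean of the weight parameters so the distribution is centered at zero, then quantize to a fixed-point grid whose range is derived from the minimum/maximum of the centered weights. The following is a minimal NumPy sketch of that idea only; the function name, the `num_bits` parameter, and the bit width shown are illustrative assumptions rather than the paper's implementation, and the iterative retraining step is omitted.

```python
import numpy as np

def zero_centered_fixed_point_quantize(weights, num_bits=6):
    """Illustrative sketch: zero-center weights, then quantize them to a
    signed fixed-point grid whose step covers the centered min/max range."""
    w = np.asarray(weights, dtype=np.float32)

    # 1) Shift the weight distribution so its center (mean) is at zero.
    mean = w.mean()
    centered = w - mean

    # 2) Derive the fixed-point step so the signed integer levels cover
    #    the full magnitude range of the centered weights.
    max_abs = np.abs(centered).max()
    levels = 2 ** (num_bits - 1) - 1          # e.g. 31 positive levels for 6 bits
    step = max_abs / levels if levels > 0 else 1.0

    # 3) Round each centered weight to the nearest fixed-point level (integer code).
    codes = np.clip(np.round(centered / step), -levels, levels).astype(np.int32)

    # 4) Dequantized weights, i.e. what the quantized network would actually use.
    dequantized = codes * step + mean
    return codes, step, mean, dequantized

# Example: quantize a random convolutional weight tensor to 6 bits.
rng = np.random.default_rng(0)
w = rng.normal(loc=0.05, scale=0.1, size=(64, 3, 3, 3)).astype(np.float32)
codes, step, mean, w_q = zero_centered_fixed_point_quantize(w, num_bits=6)
print("max abs quantization error:", np.abs(w - w_q).max())
```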