TY - GEN
T1 - Feature Distribution-based Knowledge Distillation for Deep Neural Networks
AU - Hong, Hyeonseok
AU - Kim, Hyun
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - In recent years, various compression methods and compact models have been actively proposed to address the significant computational cost that accompanies the high accuracy of deep learning. In particular, the knowledge distillation (KD) technique, which achieves a substantial network compression effect by using the information of a large teacher network to train a small student network, has received considerable attention because it offers high scalability and reusability compared to developing a compact model with a new structure. In this paper, we propose feature distribution-based knowledge distillation (FDKD), which effectively transfers semantic information using only the distribution information of feature maps obtained by a simple operation. Experimental results show that the proposed method improves accuracy by up to 5.26% and 1.38% compared to the baseline (i.e., Vanilla) and the existing KD scheme, respectively.
AB - In recent years, various compression methods and compact models have been actively proposed to address the significant computational cost that accompanies the high accuracy of deep learning. In particular, the knowledge distillation (KD) technique, which achieves a substantial network compression effect by using the information of a large teacher network to train a small student network, has received considerable attention because it offers high scalability and reusability compared to developing a compact model with a new structure. In this paper, we propose feature distribution-based knowledge distillation (FDKD), which effectively transfers semantic information using only the distribution information of feature maps obtained by a simple operation. Experimental results show that the proposed method improves accuracy by up to 5.26% and 1.38% compared to the baseline (i.e., Vanilla) and the existing KD scheme, respectively.
KW - classification
KW - deep neural network
KW - feature distribution
KW - knowledge distillation
KW - knowledge transfer
UR - http://www.scopus.com/inward/record.url?scp=85148415342&partnerID=8YFLogxK
U2 - 10.1109/ISOCC56007.2022.10031412
DO - 10.1109/ISOCC56007.2022.10031412
M3 - Conference contribution
AN - SCOPUS:85148415342
T3 - Proceedings - International SoC Design Conference 2022, ISOCC 2022
SP - 75
EP - 76
BT - Proceedings - International SoC Design Conference 2022, ISOCC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 19th International System-on-Chip Design Conference, ISOCC 2022
Y2 - 19 October 2022 through 22 October 2022
ER -
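
Editor's note: the abstract above describes distilling knowledge through the distribution of feature maps, but the record does not specify the exact FDKD loss. The following is a minimal sketch of the general idea, assuming a per-channel mean/variance statistics-matching form for the feature term; the function names, the statistics chosen as the "simple operation", and the weights alpha, beta, T are illustrative assumptions, not the paper's actual formulation.

    # Hedged sketch of distribution-based feature KD, NOT the paper's exact FDKD.
    # Assumes student and teacher feature maps share the channel dimension.
    import torch
    import torch.nn.functional as F

    def logit_kd_loss(student_logits, teacher_logits, T=4.0):
        """Classic Hinton-style KD on temperature-softened logits."""
        p_t = F.softmax(teacher_logits / T, dim=1)
        log_p_s = F.log_softmax(student_logits / T, dim=1)
        return F.kl_div(log_p_s, p_t, reduction="batchmean") * (T * T)

    def feature_distribution_loss(f_s, f_t, eps=1e-5):
        """Match per-channel feature statistics (mean, std) between
        student and teacher feature maps of shape (N, C, H, W).
        This statistics choice is an illustrative assumption."""
        mu_s = f_s.mean(dim=(0, 2, 3))
        mu_t = f_t.mean(dim=(0, 2, 3))
        sd_s = f_s.var(dim=(0, 2, 3), unbiased=False).add(eps).sqrt()
        sd_t = f_t.var(dim=(0, 2, 3), unbiased=False).add(eps).sqrt()
        return F.mse_loss(mu_s, mu_t) + F.mse_loss(sd_s, sd_t)

    def total_loss(student_logits, teacher_logits, f_s, f_t, labels,
                   alpha=0.5, beta=1.0):
        """Combine hard-label CE, logit KD, and the feature-distribution term.
        alpha/beta are hypothetical balancing weights."""
        ce = F.cross_entropy(student_logits, labels)
        kd = logit_kd_loss(student_logits, teacher_logits)
        fd = feature_distribution_loss(f_s, f_t)
        return (1 - alpha) * ce + alpha * kd + beta * fd

Because only channel-wise statistics are compared, the feature term stays cheap relative to matching full feature maps, which is consistent with the abstract's emphasis on "a simple operation", though the paper's actual operation may differ.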