Feature Distribution-based Knowledge Distillation for Deep Neural Networks

Hyeonseok Hong, Hyun Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In recent years, various compression methods and compact models have been actively proposed to reduce the significant computational cost that accompanies high accuracy in deep learning. In particular, knowledge distillation (KD), which achieves a substantial network compression effect by using the information of a large teacher network to train a small student network, has received considerable attention because it offers higher scalability and reusability than designing a new compact architecture. In this paper, we propose feature distribution-based knowledge distillation (FDKD), which effectively transfers semantic information using only the distribution information of feature maps, obtained by a simple operation. Experimental results show that the proposed method improves accuracy by up to 5.26% and 1.38% compared to the baseline (i.e., Vanilla) and the existing KD scheme, respectively.
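The abstract does not spell out the exact loss, but the idea of matching feature-map distributions can be sketched in code. The snippet below is a minimal illustration, assuming an FDKD-style alignment of channel-wise feature statistics (mean and standard deviation) between teacher and student; the function names, the balancing weight alpha, and the use of MSE are assumptions for illustration, not the paper's published formulation.

```python
import torch
import torch.nn.functional as F


def feature_distribution_loss(student_feat: torch.Tensor,
                              teacher_feat: torch.Tensor) -> torch.Tensor:
    """Align channel-wise mean/std of N x C x H x W feature maps.

    Assumes student and teacher features have matching channel counts
    (a 1x1 projection layer would be needed otherwise).
    """
    s_mean = student_feat.mean(dim=(2, 3))   # N x C spatial means
    t_mean = teacher_feat.mean(dim=(2, 3))
    s_std = student_feat.std(dim=(2, 3))     # N x C spatial std devs
    t_std = teacher_feat.std(dim=(2, 3))
    return F.mse_loss(s_mean, t_mean) + F.mse_loss(s_std, t_std)


def training_loss(student_logits, labels, student_feat, teacher_feat,
                  alpha=0.5):
    # Cross-entropy on ground-truth labels plus the distribution term;
    # alpha is a hypothetical balancing weight, not taken from the paper.
    ce = F.cross_entropy(student_logits, labels)
    fd = feature_distribution_loss(student_feat, teacher_feat.detach())
    return ce + alpha * fd
```

The teacher's features are detached so gradients flow only through the student. Matching only low-order statistics reduces the distillation target to a simple reduction over spatial dimensions, which is consistent with the abstract's claim of using distribution information "obtained by a simple operation".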

Original language: English
Title of host publication: Proceedings - International SoC Design Conference 2022, ISOCC 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 75-76
Number of pages: 2
ISBN (Electronic): 9781665459716
DOIs
State: Published - 2022
Event: 19th International System-on-Chip Design Conference, ISOCC 2022 - Gangneung-si, Korea, Republic of
Duration: 19 Oct 2022 - 22 Oct 2022

Publication series

Name: Proceedings - International SoC Design Conference 2022, ISOCC 2022

Conference

Conference: 19th International System-on-Chip Design Conference, ISOCC 2022
Country/Territory: Korea, Republic of
City: Gangneung-si
Period: 19/10/22 - 22/10/22

Keywords

  • classification
  • deep neural network
  • feature distribution
  • knowledge distillation
  • knowledge transfer
