VFT: A versatile fine-tuning scheme based on feature distribution-aware knowledge distillation for lightweight convolutional neural networks

Research output: Contribution to journal › Article › peer-review

Abstract

Network compression techniques such as pruning and quantization are being actively researched to lighten convolutional neural networks (CNNs), whose structures have grown deeper and more complex in pursuit of higher accuracy. Because most of these techniques reduce accuracy, fine-tuning is essential to recover the performance of lightweight models. However, fine-tuning has received far less research attention than compression itself, so performance recovery through fine-tuning still has significant room for improvement. In this paper, we analyze the shortcomings of existing fine-tuning methods in terms of the loss landscape and introduce a knowledge distillation (KD)-based fine-tuning approach that addresses them. In particular, KD can be adversely affected by the capacity gap between the teacher and student models or by how the transferred knowledge is defined. To overcome this limitation, we propose a feature distribution-aware knowledge distillation (FDKD) method, which defines appropriate supervision in the form of feature distributions to transfer semantic information from the teacher model. We also propose a layer-wise FDKD method that exploits a distinctive property of lightweight models: the baseline (i.e., teacher) and compressed (i.e., student) models share the same architecture. Experiments on classification tasks demonstrate the superiority of the proposed method over existing fine-tuning methods, with accuracy improvements of up to 1.99% and 3.83% for pruned and quantized models, respectively. The source code for this implementation is available at https://github.com/IDSL-SeoulTech/VFT.
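
To make the idea of layer-wise distribution matching concrete, the following is a minimal sketch and not the paper's FDKD loss. The abstract does not say how feature distributions are compared, so the sketch assumes a simple stand-in: the squared gap between per-channel means and variances of teacher and student activations, summed over layers. The function names (channel_stats, distribution_loss, layerwise_loss) and the plain-list data layout are hypothetical and chosen only for illustration; the authors' actual implementation is in the repository linked above.

```python
from statistics import fmean, pvariance


def channel_stats(feature_map):
    """Return (mean, variance) per channel for one layer's activations.

    feature_map: list of channels, each a flat list of float activations.
    """
    return [(fmean(ch), pvariance(ch)) for ch in feature_map]


def distribution_loss(teacher_fm, student_fm):
    """Mean squared gap in per-channel mean and variance (assumed stand-in)."""
    t_stats = channel_stats(teacher_fm)
    s_stats = channel_stats(student_fm)
    if len(t_stats) != len(s_stats):
        raise ValueError("teacher and student need the same channel count")
    gaps = [
        (t_mean - s_mean) ** 2 + (t_var - s_var) ** 2
        for (t_mean, t_var), (s_mean, s_var) in zip(t_stats, s_stats)
    ]
    return fmean(gaps)


def layerwise_loss(teacher_layers, student_layers, weights=None):
    """Weighted sum of per-layer losses.

    Pairing layers one-to-one relies on the teacher and student sharing
    the same architecture, as the abstract states for compressed models.
    """
    if len(teacher_layers) != len(student_layers):
        raise ValueError("teacher and student need the same number of layers")
    if weights is None:
        weights = [1.0] * len(teacher_layers)  # assumption: equal layer weights
    return sum(
        w * distribution_loss(t, s)
        for w, t, s in zip(weights, teacher_layers, student_layers)
    )


# One layer with two channels; the student's first channel is shifted by +1.
teacher = [[[1.0, 3.0], [2.0, 2.0]]]
student = [[[2.0, 4.0], [2.0, 2.0]]]
print(layerwise_loss(teacher, teacher))  # identical features give zero loss
print(layerwise_loss(teacher, student))
```

In a real fine-tuning loop, a term of this kind would typically be added to the task loss and computed on framework tensors rather than Python lists; the one-to-one layer pairing is what the shared teacher and student architecture makes possible.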

Original language: English
Article number: 111597
Journal: Engineering Applications of Artificial Intelligence
Volume: 159
State: Published - 1 Nov 2025

Keywords

  • Convolutional neural network
  • Fine-tuning
  • Knowledge distillation
  • Loss landscape
  • Network compression
