TY - JOUR
T1 - Trunk Pruning: Highly Compatible Channel Pruning for Convolutional Neural Networks Without Fine-Tuning
AU - Kim, Nam Joon
AU - Kim, Hyun
PY - 2024
Y1 - 2024
AB - Channel pruning can efficiently reduce the computation and memory footprint of convolutional neural networks (CNNs) with only a modest accuracy drop by removing unnecessary channels. Among the various channel pruning approaches, sparsity training is the most popular because it is simple to implement and trains end-to-end, automatically identifying optimal network structures by applying regularization to parameters. Although sparsity training achieves a remarkable trade-off between accuracy and network size reduction, it must be followed by a time-consuming fine-tuning process. Moreover, although high-performance activation functions continue to be developed, existing sparsity training methods do not scale well to these new activation functions. To address these problems, this study proposes trunk pruning, a novel pruning method that produces a compact network with minimal accuracy drop at inference even without a fine-tuning process. In the proposed method, one kernel of the next convolutional layer absorbs all the information of the kernels to be pruned, accounting for the effects of the batch normalization (BN) shift parameters that remain after sparsity training. Because trunk pruning thereby reproduces the output of the unpruned network after sparsity training and removes the pruning loss, the fine-tuning process can be eliminated. Furthermore, because trunk pruning manipulates only the shift parameters of the BN associated with the CONV layer, it has the significant advantage of being compatible with all BN-based sparsity training schemes and can handle various activation functions.
KW - Convolutional Neural Network (CNN)
KW - Fine-Tuning
KW - Pruning
KW - Regularization
UR - http://www.scopus.com/inward/record.url?scp=85184829159&partnerID=8YFLogxK
U2 - 10.1109/TMM.2023.3338052
DO - 10.1109/TMM.2023.3338052
M3 - Article
AN - SCOPUS:85184829159
SN - 1520-9210
VL - 26
SP - 5588
EP - 5599
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -