TY - GEN
T1 - Extreme Pruning Technique Based on Filter Deactivation Using Sparsity Training for Deep Convolutional Neural Networks
AU - Koo, Kwanghyun
AU - Kim, Hyun
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Pruning provides significant compression of deep convolutional neural networks and has been studied in two major directions: filter pruning and weight pruning. Filter pruning, which eliminates 3-dimensional parameter groups known as filters, offers high acceleration performance; however, it is difficult to apply to complex network structures and may yield relatively low compression performance. Weight pruning, which removes individual 1-dimensional parameters, is well suited to complex network structures because it can achieve high compression performance, but it is difficult to accelerate in a general GPU environment. To address this problem, kernel pruning, which removes 2-dimensional kernels, has been proposed. Kernel pruning is easier to accelerate in hardware than weight pruning while still achieving a relatively high pruning rate. Nonetheless, kernel pruning gives rise to filter deactivation, a phenomenon in which certain filters generate outputs that remain entirely unused by the subsequent layer. In this paper, we exploit the fact that deactivated filters can be removed without affecting network performance and present a novel approach that creates more deactivated filters through sparsity training, thereby achieving a higher pruning rate. By applying the proposed method to kernel pruning, we achieve a performance improvement of 0.22% while removing an additional 2.58% of parameters in ResNet-110 evaluated on the CIFAR-10 dataset.
AB - Pruning provides significant compression of deep convolutional neural networks and has been studied in two major directions: filter pruning and weight pruning. Filter pruning, which eliminates 3-dimensional parameter groups known as filters, offers high acceleration performance; however, it is difficult to apply to complex network structures and may yield relatively low compression performance. Weight pruning, which removes individual 1-dimensional parameters, is well suited to complex network structures because it can achieve high compression performance, but it is difficult to accelerate in a general GPU environment. To address this problem, kernel pruning, which removes 2-dimensional kernels, has been proposed. Kernel pruning is easier to accelerate in hardware than weight pruning while still achieving a relatively high pruning rate. Nonetheless, kernel pruning gives rise to filter deactivation, a phenomenon in which certain filters generate outputs that remain entirely unused by the subsequent layer. In this paper, we exploit the fact that deactivated filters can be removed without affecting network performance and present a novel approach that creates more deactivated filters through sparsity training, thereby achieving a higher pruning rate. By applying the proposed method to kernel pruning, we achieve a performance improvement of 0.22% while removing an additional 2.58% of parameters in ResNet-110 evaluated on the CIFAR-10 dataset.
KW - Convolutional neural network
KW - Kernel pruning
KW - Network compression
KW - Sparsity training
UR - https://www.scopus.com/pages/publications/85189245682
U2 - 10.1109/ICEIC61013.2024.10457126
DO - 10.1109/ICEIC61013.2024.10457126
M3 - Conference contribution
AN - SCOPUS:85189245682
T3 - 2024 International Conference on Electronics, Information, and Communication, ICEIC 2024
BT - 2024 International Conference on Electronics, Information, and Communication, ICEIC 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 International Conference on Electronics, Information, and Communication, ICEIC 2024
Y2 - 28 January 2024 through 31 January 2024
ER -