TY - GEN
T1 - Mask-Soft Filter Pruning for Lightweight CNN Inference
AU - Kim, Nam Joon
AU - Kim, Hyun
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/10/21
Y1 - 2020/10/21
AB - Pruning is a network compression and acceleration technique that reduces the computation and memory footprint of convolutional neural networks (CNNs), and many pruning techniques have been proposed. Weight pruning (i.e., unstructured pruning) removes redundant weights and can eliminate many parameters, but it requires special software or hardware support to actually accelerate the network on a GPU. Filter pruning, by contrast, removes entire filters, requires no special software or hardware support, and therefore yields actual acceleration of CNNs on a GPU. Inspired by earlier work on soft filter pruning, which prunes filters in a soft manner, this paper proposes the Mask-Soft Filter Pruning (M-SFP) method. M-SFP masks the output feature maps instead of zeroing out the weights, so the weight parameters are preserved. Applied to ResNet on the CIFAR-10 and CIFAR-100 datasets, the proposed technique reduces BFLOPs by more than 40% with an acceptable accuracy drop of only 0.17%.
KW - Deep learning
KW - image classification
KW - network compression
KW - pruning
UR - http://www.scopus.com/inward/record.url?scp=85100729006&partnerID=8YFLogxK
U2 - 10.1109/ISOCC50952.2020.9333054
DO - 10.1109/ISOCC50952.2020.9333054
M3 - Conference contribution
AN - SCOPUS:85100729006
T3 - Proceedings - International SoC Design Conference, ISOCC 2020
SP - 316
EP - 317
BT - Proceedings - International SoC Design Conference, ISOCC 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th International System-on-Chip Design Conference, ISOCC 2020
Y2 - 21 October 2020 through 24 October 2020
ER -