TY - GEN
T1 - Implementation of Tiled Point-Wise Convolution in MobileNet for Parallel Processing
AU - Hong, Hyeon Seok
AU - Kim, Hyun
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Convolutional neural networks (CNNs) have demonstrated outstanding performance in computer vision tasks. However, their heavy computational cost makes CNNs difficult to deploy on edge and mobile devices. To address this, lightweight CNNs (e.g., MobileNet) and dedicated FPGA accelerator designs have gained attention. Even so, the point-wise convolution (PWC) in MobileNet accounts for a significant portion of the computation and has a critical impact on latency, making its optimization crucial for inference acceleration. In this study, we present a tiled PWC that enables parallel processing by partitioning feature maps into tiles, and we optimize this operation efficiently. The proposed design enables parallel PWC processing without additional controller modifications. We implement the proposed design on the Xilinx ZCU102 board platform and observe 4.0× and 3.1× improvements in throughput and power efficiency, respectively.
AB - Convolutional neural networks (CNNs) have demonstrated outstanding performance in computer vision tasks. However, their heavy computational cost makes CNNs difficult to deploy on edge and mobile devices. To address this, lightweight CNNs (e.g., MobileNet) and dedicated FPGA accelerator designs have gained attention. Even so, the point-wise convolution (PWC) in MobileNet accounts for a significant portion of the computation and has a critical impact on latency, making its optimization crucial for inference acceleration. In this study, we present a tiled PWC that enables parallel processing by partitioning feature maps into tiles, and we optimize this operation efficiently. The proposed design enables parallel PWC processing without additional controller modifications. We implement the proposed design on the Xilinx ZCU102 board platform and observe 4.0× and 3.1× improvements in throughput and power efficiency, respectively.
KW - CNN accelerator
KW - FPGA
KW - MobileNetV1
UR - http://www.scopus.com/inward/record.url?scp=85189238896&partnerID=8YFLogxK
U2 - 10.1109/ICEIC61013.2024.10457207
DO - 10.1109/ICEIC61013.2024.10457207
M3 - Conference contribution
AN - SCOPUS:85189238896
T3 - 2024 International Conference on Electronics, Information, and Communication, ICEIC 2024
BT - 2024 International Conference on Electronics, Information, and Communication, ICEIC 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 International Conference on Electronics, Information, and Communication, ICEIC 2024
Y2 - 28 January 2024 through 31 January 2024
ER -