TY - JOUR
T1 - Improving Performance of Real-Time Object Detection in Edge Device Through Concurrent Multi-Frame Processing
AU - Kim, Seunghwan
AU - Kim, Changjong
AU - Kim, Sunggon
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2025
Y1 - 2025
N2 - As the performance and accuracy of machine learning and AI algorithms improve, the demand for adopting computer vision techniques to solve various problems, such as autonomous driving and AI robots, increases. To meet this demand, IoT and edge devices, which are small enough to be deployed in various environments while having sufficient computing capabilities, are being widely adopted. However, because devices in IoT and edge environments face harsh restrictions compared to traditional server environments, they are often limited by low computational and memory resources, in addition to a limited electrical power supply. This necessitates a unique approach for small IoT devices that are required to run complex tasks. In this paper, we propose a concurrent multi-frame processing scheme for real-time object detection algorithms. To do this, we first divide the video into individual frames and group the frames according to the number of cores in the device. Then, we allocate a group of frames per core to perform the object detection, resulting in parallel detection of multiple frames. We implement our scheme in YOLO (You Only Look Once), one of the most popular real-time object detection algorithms, on a state-of-the-art, resource-constrained IoT edge device, the Nvidia Jetson Orin Nano, using real-world video and image datasets, including MS-COCO, ImageNet, PascalVOC, DOTA, animal videos, and car-traffic videos. Our evaluation results show that our proposed scheme can improve diverse aspects of edge performance, improving runtime, memory consumption, and power usage by up to 445%, 69%, and 73%, respectively. Additionally, it demonstrates an improvement of 2.10× over state-of-the-art model optimization.
AB - As the performance and accuracy of machine learning and AI algorithms improve, the demand for adopting computer vision techniques to solve various problems, such as autonomous driving and AI robots, increases. To meet this demand, IoT and edge devices, which are small enough to be deployed in various environments while having sufficient computing capabilities, are being widely adopted. However, because devices in IoT and edge environments face harsh restrictions compared to traditional server environments, they are often limited by low computational and memory resources, in addition to a limited electrical power supply. This necessitates a unique approach for small IoT devices that are required to run complex tasks. In this paper, we propose a concurrent multi-frame processing scheme for real-time object detection algorithms. To do this, we first divide the video into individual frames and group the frames according to the number of cores in the device. Then, we allocate a group of frames per core to perform the object detection, resulting in parallel detection of multiple frames. We implement our scheme in YOLO (You Only Look Once), one of the most popular real-time object detection algorithms, on a state-of-the-art, resource-constrained IoT edge device, the Nvidia Jetson Orin Nano, using real-world video and image datasets, including MS-COCO, ImageNet, PascalVOC, DOTA, animal videos, and car-traffic videos. Our evaluation results show that our proposed scheme can improve diverse aspects of edge performance, improving runtime, memory consumption, and power usage by up to 445%, 69%, and 73%, respectively. Additionally, it demonstrates an improvement of 2.10× over state-of-the-art model optimization.
KW - Edge devices
KW - machine learning
KW - object detection
KW - performance optimization
UR - https://www.scopus.com/pages/publications/85212933577
U2 - 10.1109/ACCESS.2024.3520240
DO - 10.1109/ACCESS.2024.3520240
M3 - Article
AN - SCOPUS:85212933577
SN - 2169-3536
VL - 13
SP - 1522
EP - 1533
JO - IEEE Access
JF - IEEE Access
ER -