TY - CONF
T1 - Anti-Litter Surveillance based on Person Understanding via Multi-Task Learning
AU - Bae, Kangmin
AU - Yun, Kimin
AU - Kim, Hyungil
AU - Lee, Youngwan
AU - Park, Jongyoul
N1 - Publisher Copyright: © 2020. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms.
PY - 2020
Y1 - 2020
N2 - In this paper, we propose a new framework for an anti-litter visual surveillance system to prevent garbage dumping as a real-world application. There have been many efforts to deploy action-recognition-based visual surveillance systems. However, many conventional methods overfit to specific scenes because of hand-crafted rules and a lack of real-world data. To overcome this problem, we propose a novel algorithm that handles the diverse scene properties of real-world surveillance. In addition to collecting real-world data, we train an effective model to understand the person through multiple datasets covering human poses, coarse human actions (e.g., upright, bent), and fine actions (e.g., pushing a cart) via multi-task learning. As a result, our approach eliminates the need for scene-by-scene tuning and makes behavior understanding more robust in a visual surveillance system. In addition, we propose a new object detection network that is optimized for detecting carryable objects and persons. The proposed detection network reduces the computational cost by restricting potential suspects to persons who carry an object. Our method outperforms state-of-the-art methods in detecting the garbage dumping action on a real-world surveillance video dataset.
UR - http://www.scopus.com/inward/record.url?scp=85102223792&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85102223792
T2 - 31st British Machine Vision Conference, BMVC 2020
Y2 - 7 September 2020 through 10 September 2020
ER -