Low-complexity object detection with deep convolutional neural network for embedded systems

Subarna Tripathi, Byeongkeun Kang, Gokce Dane, Truong Nguyen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

11 Scopus citations

Abstract

We investigate low-complexity convolutional neural networks (CNNs) for object detection in embedded vision applications. It is well known that realizing CNN-based object detection on an embedded system is more challenging than problems such as image classification because of its computation and memory requirements. To meet these requirements, we design and develop an end-to-end TensorFlow (TF)-based fully convolutional deep neural network for the generic object detection task, inspired by one of the fastest frameworks, YOLO [1]. As in YOLO, the proposed network predicts the location of every object by regressing the coordinates of the corresponding bounding box. Hence, the network can detect objects without any limitation on their size. However, unlike YOLO, all the layers in the proposed network are fully convolutional, so it can take input images of any size. We pick face detection as a use case and evaluate the proposed model on the FDDB and Widerface datasets. As another use case of generic object detection, we evaluate its performance on the PASCAL VOC dataset. The experimental results demonstrate that the proposed network can predict object instances of different sizes and poses in a single frame. Moreover, the results show that the proposed method achieves accuracy comparable to state-of-the-art CNN-based object detection methods while reducing the model size by 3× and the memory bandwidth (BW) by 3-4× compared with one of the best real-time CNN-based object detectors, YOLO. Our 8-bit fixed-point TF model provides an additional 4× memory reduction while keeping the accuracy nearly as good as that of the floating-point model. Moreover, the fixed-point model is capable of achieving 20× faster inference than the floating-point model. Thus, the proposed method is promising for embedded implementations.
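
The 4× memory reduction quoted for the 8-bit fixed-point model matches the storage ratio between 32-bit floating-point and 8-bit integer weights. The following is a minimal sketch of that arithmetic, not the authors' TF quantization procedure: it assumes symmetric per-tensor linear quantization, and the function names `quantize_int8` and `dequantize` as well as the sample weights are hypothetical.

```python
from array import array


def quantize_int8(weights):
    """Symmetric linear quantization of floats to signed 8-bit integers.

    Assumption: one scale per tensor, chosen so the largest magnitude maps to 127.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    q = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return q, scale


def dequantize(q, scale):
    """Map 8-bit integers back to approximate floating-point values."""
    return [v * scale for v in q]


# Hypothetical weights stored as 32-bit floats (typecode "f").
weights = array("f", [0.5, -1.0, 0.25, 0.75, -0.125, 0.0, 1.0, -0.6])
q, scale = quantize_int8(weights)

float_bytes = len(weights) * weights.itemsize  # bytes used by float32 storage
int8_bytes = len(q) * q.itemsize               # bytes used by int8 storage
reduction = float_bytes / int8_bytes

# Worst-case rounding error is about half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, dequantize(q, scale)))

print(f"memory reduction: {reduction}x, max abs error: {max_error:.5f}")
```

The storage ratio depends only on the element widths, so it holds for any number of weights; the accuracy impact, by contrast, depends on the weight distribution and on how activations are quantized, which this sketch does not model.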

Original language: English
Title of host publication: Applications of Digital Image Processing XL
Editors: Andrew G. Tescher
Publisher: SPIE
ISBN (Electronic): 9781510612495
State: Published - 2017
Event: Applications of Digital Image Processing XL 2017 - San Diego, United States
Duration: 7 Aug 2017 to 10 Aug 2017

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 10396
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X

Conference

Conference: Applications of Digital Image Processing XL 2017
Country/Territory: United States
City: San Diego
Period: 7/08/17 to 10/08/17

Keywords

  • Convolutional Neural Networks
  • Embedded Systems
  • Face Detection
  • Object Detection
