TY - GEN
T1 - LCDet
T2 - 30th IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2017
AU - Tripathi, Subarna
AU - Dane, Gokce
AU - Kang, Byeongkeun
AU - Bhaskaran, Vasudev
AU - Nguyen, Truong
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/8/22
Y1 - 2017/8/22
N2 - Deep Convolutional Neural Networks (CNNs) are the state-of-the-art performers for the object detection task. It is well known that object detection requires more computation and memory than image classification. In this work, we propose LCDet, a fully-convolutional neural network for generic object detection that aims to work in embedded systems. We design and develop an end-to-end TensorFlow (TF)-based model. The detection works by a single forward pass through the network. Additionally, we employ 8-bit quantization on the learned weights. As a use case, we choose face detection and train the proposed model on images containing a varying number of faces of different sizes. We evaluate the face detection performance on the publicly available datasets FDDB and Widerface. Our experimental results show that the proposed method achieves accuracy comparable to state-of-the-art CNN-based face detection methods while reducing the model size by 3× and memory bandwidth by 3-4× compared with one of the best real-time CNN-based object detectors, YOLO [23]. Our 8-bit fixed-point TF model provides an additional 4× memory reduction while keeping the accuracy nearly as good as the floating-point model and achieves a 20× performance gain compared to the floating-point model. Thus the proposed model is amenable to embedded implementations and is generic enough to be extended to any number of object categories.
AB - Deep Convolutional Neural Networks (CNNs) are the state-of-the-art performers for the object detection task. It is well known that object detection requires more computation and memory than image classification. In this work, we propose LCDet, a fully-convolutional neural network for generic object detection that aims to work in embedded systems. We design and develop an end-to-end TensorFlow (TF)-based model. The detection works by a single forward pass through the network. Additionally, we employ 8-bit quantization on the learned weights. As a use case, we choose face detection and train the proposed model on images containing a varying number of faces of different sizes. We evaluate the face detection performance on the publicly available datasets FDDB and Widerface. Our experimental results show that the proposed method achieves accuracy comparable to state-of-the-art CNN-based face detection methods while reducing the model size by 3× and memory bandwidth by 3-4× compared with one of the best real-time CNN-based object detectors, YOLO [23]. Our 8-bit fixed-point TF model provides an additional 4× memory reduction while keeping the accuracy nearly as good as the floating-point model and achieves a 20× performance gain compared to the floating-point model. Thus the proposed model is amenable to embedded implementations and is generic enough to be extended to any number of object categories.
UR - https://www.scopus.com/pages/publications/85030211215
U2 - 10.1109/CVPRW.2017.56
DO - 10.1109/CVPRW.2017.56
M3 - Conference contribution
AN - SCOPUS:85030211215
T3 - IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
SP - 411
EP - 420
BT - Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2017
PB - IEEE Computer Society
Y2 - 21 July 2017 through 26 July 2017
ER -