TY - GEN
T1 - Real-time sign language fingerspelling recognition using convolutional neural networks from depth map
AU - Kang, Byeongkeun
AU - Tripathi, Subarna
AU - Nguyen, Truong Q.
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2016/6/7
Y1 - 2016/6/7
N2 - Sign language recognition is important for natural and convenient communication between the deaf community and the hearing majority. We take a highly efficient initial step toward an automatic fingerspelling recognition system using convolutional neural networks (CNNs) on depth maps. In this work, we consider a relatively larger number of classes than the previous literature. We train CNNs for the classification of 31 letters and digits using a subset of depth data collected from multiple subjects. Using different learning configurations, such as hyper-parameter selection with and without validation, we achieve 99.99% accuracy for observed signers and 83.58% to 85.49% accuracy for new signers. The results show that accuracy improves as we include more data from different subjects during training. The processing time is 3 ms for the prediction of a single image. To the best of our knowledge, the system achieves the highest accuracy and speed. The trained model and dataset are available in our repository.
AB - Sign language recognition is important for natural and convenient communication between the deaf community and the hearing majority. We take a highly efficient initial step toward an automatic fingerspelling recognition system using convolutional neural networks (CNNs) on depth maps. In this work, we consider a relatively larger number of classes than the previous literature. We train CNNs for the classification of 31 letters and digits using a subset of depth data collected from multiple subjects. Using different learning configurations, such as hyper-parameter selection with and without validation, we achieve 99.99% accuracy for observed signers and 83.58% to 85.49% accuracy for new signers. The results show that accuracy improves as we include more data from different subjects during training. The processing time is 3 ms for the prediction of a single image. To the best of our knowledge, the system achieves the highest accuracy and speed. The trained model and dataset are available in our repository.
UR - http://www.scopus.com/inward/record.url?scp=84978870819&partnerID=8YFLogxK
U2 - 10.1109/ACPR.2015.7486481
DO - 10.1109/ACPR.2015.7486481
M3 - Conference contribution
AN - SCOPUS:84978870819
T3 - Proceedings - 3rd IAPR Asian Conference on Pattern Recognition, ACPR 2015
SP - 136
EP - 140
BT - Proceedings - 3rd IAPR Asian Conference on Pattern Recognition, ACPR 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd IAPR Asian Conference on Pattern Recognition, ACPR 2015
Y2 - 3 November 2015 through 6 November 2015
ER -