Knock&Tap: Classification and Localization of Knock and Tap Gestures using Deep Sound Transfer Learning

Jae Yeop Jeong, Jung Hwa Kim, Ha Yeong Yoon, Jin Woo Jeong

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Gesture interaction is a promising approach to controlling smart devices. In this paper, we present Knock&Tap, an audio-based approach that performs gesture classification and gesture localization using deep transfer learning. Knock&Tap consists of a single 4-microphone array that records the sound of the user's knock and tap gestures, together with a wood or glass panel on which the gestures are performed. Knock&Tap can be used in situations or environments where vision-based gesture recognition is infeasible due to lighting conditions or camera installation constraints. Various experiments were conducted to validate the feasibility of Knock&Tap with 7 gesture types on both wood and glass panels. Our experimental results show that Knock&Tap predicts the gesture type and location with accuracies of up to 97.24% and 92.05%, respectively.
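The pipeline the abstract describes (record an impact sound, extract spectral features, classify the gesture) can be illustrated with a minimal sketch. Note this is not the paper's method: Knock&Tap uses deep transfer learning on multi-channel audio, whereas the sketch below substitutes a simple log-spectrum feature and a nearest-centroid classifier on synthetic single-channel signals. All names, signal parameters, and the two synthetic "knock"/"tap" sound models are invented for illustration.

```python
import numpy as np

def synth_impulse(freq, decay, sr=16000, dur=0.25, rng=None):
    # Synthetic stand-in for a recorded impact sound: a decaying
    # sinusoid plus noise (frequencies/decays are assumed, illustrative).
    t = np.arange(int(sr * dur)) / sr
    sig = np.sin(2 * np.pi * freq * t) * np.exp(-decay * t)
    return sig + 0.01 * rng.standard_normal(t.size)

def log_spec_feature(sig, n_fft=512):
    # Frame-averaged log-magnitude spectrum: a simplified stand-in for
    # the spectrogram-style inputs commonly fed to pretrained audio nets.
    frames = sig[: (sig.size // n_fft) * n_fft].reshape(-1, n_fft)
    mag = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(mag).mean(axis=0)

rng = np.random.default_rng(0)
# Hypothetical training set: "knock" modeled as a low, fast-decaying
# impact, "tap" as a higher-pitched one.
train = {
    "knock": [log_spec_feature(synth_impulse(180, 40, rng=rng)) for _ in range(20)],
    "tap":   [log_spec_feature(synth_impulse(900, 25, rng=rng)) for _ in range(20)],
}
centroids = {k: np.mean(v, axis=0) for k, v in train.items()}

def classify(sig):
    # Nearest-centroid in feature space, in place of a learned classifier head.
    f = log_spec_feature(sig)
    return min(centroids, key=lambda k: np.linalg.norm(f - centroids[k]))
```

In the actual system, the feature extractor would be a network pretrained on large-scale audio data (the "deep sound transfer learning" of the title), with gesture type and panel location predicted from the 4-channel recordings.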

Original language: English
Title of host publication: ICMI 2021 Companion - Companion Publication of the 2021 International Conference on Multimodal Interaction
Publisher: Association for Computing Machinery, Inc
Pages: 1-6
Number of pages: 6
ISBN (Electronic): 9781450384711
DOIs
State: Published - 18 Oct 2021
Event: 23rd ACM International Conference on Multimodal Interaction, ICMI 2021 - Virtual, Online, Canada
Duration: 18 Oct 2021 to 22 Oct 2021

Publication series

Name: ICMI 2021 Companion - Companion Publication of the 2021 International Conference on Multimodal Interaction

Conference

Conference: 23rd ACM International Conference on Multimodal Interaction, ICMI 2021
Country/Territory: Canada
City: Virtual, Online
Period: 18/10/21 to 22/10/21

Keywords

  • Audio classification
  • Gesture recognition
  • Transfer learning
