TY - JOUR
T1 - EchoTap: Non-Verbal Sound Interaction with Knock and Tap Gestures
AU - Jeong, Jae Yeop
AU - Kim, Daun
AU - Jeong, Jin Woo
N1 - Publisher Copyright:
© 2024 Taylor & Francis Group, LLC.
PY - 2025
Y1 - 2025
AB - The growing demand for highly accessible interaction technologies to effectively interact with smart devices has led to the increasing popularity of voice user interfaces (VUIs). However, VUIs face interpretation challenges stemming from the variability of natural language input, such as speech clarity issues, linguistic variability, and speech impediments. As an alternative, non-verbal sound-based interaction techniques emerge as highly advantageous for smart device control, mitigating the inherent challenges of VUIs. In this article, we introduce EchoTap, a novel audio interface that harnesses the distinctive sound responses generated by knock and tap gestures on target objects. Employing deep neural networks, EchoTap recognizes both the type and location of these gestures based on their unique sound signatures. Through offline evaluation, EchoTap demonstrated competitive classification accuracy (88% on average) and localization precision (93% on average). Moreover, a user study involving 12 participants validated EchoTap’s practical effectiveness and user-friendliness in real-world scenarios. This study highlights EchoTap’s potential for various daily interaction contexts and discusses further design implications for leveraging auditory interfaces based on simple gestures.
KW - gesture classification
KW - gesture localization
KW - non-verbal sound
KW - sound interface
KW - usability
UR - http://www.scopus.com/inward/record.url?scp=105001071295&partnerID=8YFLogxK
U2 - 10.1080/10447318.2024.2348837
DO - 10.1080/10447318.2024.2348837
M3 - Article
AN - SCOPUS:105001071295
SN - 1044-7318
VL - 41
SP - 4189
EP - 4210
JO - International Journal of Human-Computer Interaction
JF - International Journal of Human-Computer Interaction
IS - 7
ER -