SIDC-KWS: Efficient Spiking Inception-Dilated Conformer with Self-Attention for Keyword Spotting

Research output: Contribution to journal › Conference article › peer-review

Abstract

Recent deep learning advances have improved keyword spotting (KWS). However, as KWS is deployed on edge devices, energy efficiency remains a key challenge. Conventional deep neural networks offer high accuracy but require heavy computation, making them unsuitable for low-power use. To address this, we propose the Spiking Inception-Dilated Conformer for Keyword Spotting (SIDC-KWS), an energy-efficient transformer based on spiking neural networks (SNNs). By integrating an Inception-Dilated (ID) block and spike-based self-attention, SIDC-KWS maintains high accuracy while significantly reducing power consumption. Experiments on the Google Speech Commands V2 (GSC V2) dataset show that SIDC-KWS achieves 96.8% and 94.7% accuracy on 12-class and 35-class tasks, respectively. On the 35-class task, SIDC-KWS consumes 75.59% less energy than its ANN counterpart. These results underscore SNNs as a scalable, low-power alternative for real-time KWS in resource-limited environments.
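The paper itself is not reproduced on this page, but spike-based self-attention of the kind the abstract describes is commonly built along Spikformer-style lines: queries, keys, and values are binarized into spike trains by leaky integrate-and-fire (LIF) neurons, so the Q·Kᵀ·V product reduces to sparse accumulations and needs no softmax. A minimal NumPy sketch under those assumptions (shapes, thresholds, and the `scale` factor are illustrative choices, not taken from SIDC-KWS):

```python
import numpy as np

def lif_spikes(x, threshold=1.0, decay=0.5):
    """Turn real-valued inputs over time into binary spike trains with a
    simple leaky integrate-and-fire neuron (hard reset on spike)."""
    mem = np.zeros_like(x[0])           # membrane potential per neuron
    spikes = np.zeros_like(x)
    for t in range(x.shape[0]):
        mem = decay * mem + x[t]        # leak, then integrate input
        spikes[t] = (mem >= threshold).astype(x.dtype)
        mem = mem * (1.0 - spikes[t])   # reset where a spike fired
    return spikes

def spiking_self_attention(x, Wq, Wk, Wv, scale=0.125):
    """Spike-based self-attention sketch: Q, K, V are binary spike trains,
    so the attention product is accumulation-only (no softmax)."""
    q = lif_spikes(x @ Wq)              # (T, N, d) spike trains
    k = lif_spikes(x @ Wk)
    v = lif_spikes(x @ Wv)
    # Per time step: (N x d) @ (d x N) @ (N x d) -> (N x d)
    out = np.stack([scale * (q[t] @ k[t].T) @ v[t] for t in range(x.shape[0])])
    return lif_spikes(out)              # spiking output for the next layer

rng = np.random.default_rng(0)
T, N, d = 4, 8, 16                      # time steps, frames, feature dim
x = rng.random((T, N, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.3 for _ in range(3))
y = spiking_self_attention(x, Wq, Wk, Wv)
print(y.shape)                          # (4, 8, 16); entries are 0/1 spikes
```

Because every intermediate is binary, multiply-accumulates collapse to conditional additions, which is the source of the energy savings the abstract reports relative to an ANN counterpart.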

Original language: English
Pages (from-to): 2665-2669
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
DOIs
State: Published - 2025
Event: 26th Interspeech Conference 2025 - Rotterdam, Netherlands
Duration: 17 Aug 2025 - 21 Aug 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 7 - Affordable and Clean Energy

Keywords

  • Energy Efficiency
  • Keyword Spotting
  • Speech Recognition
  • Spiking Neural Network
  • Transformer
