A non-intrusive speech intelligibility estimation method based on deep learning using autoencoder features

Yoonhee Kim, Deokgyu Yun, Hannah Lee, Seung Ho Choi

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

This paper presents a deep learning-based non-intrusive speech intelligibility estimation method that uses the bottleneck features of an autoencoder. The conventional standard non-intrusive speech intelligibility estimation method, P.563, performs poorly at intelligibility estimation in various noise environments. We propose a more accurate method based on a long short-term memory (LSTM) neural network whose input is the autoencoder bottleneck features and whose output is a short-time objective intelligibility (STOI) score. STOI is a standard intrusive intelligibility measure, meaning that it requires a reference speech signal. In comparisons with the conventional standard P.563 and with a mel-frequency cepstral coefficient (MFCC) feature-based intelligibility estimation method, the proposed method showed superior performance for speech signals in various noise environments.
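
The data flow described in the abstract (frame, then autoencoder bottleneck features, then LSTM, then an estimated STOI score) can be sketched as a dependency-free forward pass. This is only an illustration under stated assumptions, not the authors' implementation: the weights are random and untrained, the layer sizes, the single-layer tanh encoder, the omission of biases, the single utterance-level output, and all function names are hypothetical choices made for this sketch. In the setup the abstract describes, the regression targets would be STOI scores computed with reference signals, so that no reference is needed when estimating.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a real system would learn these."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(frame, W_enc):
    """Encoder half of an autoencoder: feature frame -> bottleneck features."""
    return [math.tanh(z) for z in matvec(W_enc, frame)]


def lstm_step(x, h, c, W):
    """One LSTM time step. W maps each gate name to a matrix applied to [x; h].
    Biases are omitted to keep the sketch short."""
    xh = x + h  # list concatenation of input and previous hidden state
    i = [sigmoid(z) for z in matvec(W["i"], xh)]    # input gate
    f = [sigmoid(z) for z in matvec(W["f"], xh)]    # forget gate
    o = [sigmoid(z) for z in matvec(W["o"], xh)]    # output gate
    g = [math.tanh(z) for z in matvec(W["g"], xh)]  # candidate cell state
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def estimate_intelligibility(frames, W_enc, W_lstm, w_out):
    """Run the LSTM over bottleneck features and map the last hidden state
    to one score in (0, 1), the nominal range of STOI (an assumption here)."""
    hidden = len(w_out)
    h, c = [0.0] * hidden, [0.0] * hidden
    for frame in frames:
        h, c = lstm_step(encode(frame, W_enc), h, c, W_lstm)
    return sigmoid(sum(w * v for w, v in zip(w_out, h)))


# Hypothetical sizes and random stand-in data (no real speech is used here).
rng = random.Random(0)
FRAME_DIM, BOTTLENECK_DIM, HIDDEN_DIM = 40, 8, 16
W_enc = rand_matrix(BOTTLENECK_DIM, FRAME_DIM, rng)
W_lstm = {gate: rand_matrix(HIDDEN_DIM, BOTTLENECK_DIM + HIDDEN_DIM, rng)
          for gate in "ifog"}
w_out = [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN_DIM)]
frames = [[rng.gauss(0.0, 1.0) for _ in range(FRAME_DIM)] for _ in range(50)]

score = estimate_intelligibility(frames, W_enc, W_lstm, w_out)
print(f"estimated intelligibility score: {score:.3f}")
```

Because the weights are untrained, the printed score carries no meaning; the sketch only shows that the estimator consumes the degraded signal's features alone, which is what makes the approach non-intrusive.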

Original language: English
Pages (from-to): 714-715
Number of pages: 2
Journal: IEICE Transactions on Information and Systems
Volume: E103D
Issue number: 3
State: Published - 2020

Keywords

  • Autoencoder
  • Bottleneck feature
  • Deep learning
  • Long short-term memory (LSTM)
  • STOI
