A deep learning-based approach to non-intrusive objective speech intelligibility estimation

Deokgyu Yun, Hannah Lee, Seung Ho Choi

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

This paper proposes a deep learning-based, non-intrusive objective speech intelligibility estimation method that uses a recurrent neural network (RNN) with a long short-term memory (LSTM) structure. Conventional non-intrusive estimation methods, such as the ITU-T standard P.563, estimate poorly and inconsistently, especially in varied noise and reverberation environments. The proposed method trains the LSTM RNN parameters using short-time objective intelligibility (STOI), a standard intrusive intelligibility measure that requires the reference speech signal, as the training target. The input and output of the LSTM RNN are the mel-frequency cepstral coefficient (MFCC) vector and the frame-wise STOI value, respectively. Experimental results show that the proposed method outperforms P.563 in various noisy and reverberant environments.
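The input/output relationship described in the abstract (one MFCC vector in, one frame-wise intelligibility value out) can be illustrated with a minimal sketch. This is not the authors' model: the class name `TinyLSTMRegressor`, the layer sizes, the sigmoid output unit, and the averaging in `utterance_score` are all assumptions made for illustration, and the weights are random rather than trained against STOI targets.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class TinyLSTMRegressor:
    """Single-layer LSTM with a sigmoid output unit (hypothetical, untrained)."""

    def __init__(self, n_mfcc=13, n_hidden=8, seed=0):
        rng = random.Random(seed)
        n_in = n_mfcc + n_hidden  # each gate sees [current frame, previous hidden state]

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {g: init(n_hidden, n_in) for g in "ifoc"}
        self.b = {g: [0.0] * n_hidden for g in "ifoc"}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b_out = 0.0
        self.n_hidden = n_hidden

    def predict(self, mfcc_frames):
        """Map a sequence of MFCC vectors to one score per frame."""
        h = [0.0] * self.n_hidden  # hidden state
        c = [0.0] * self.n_hidden  # cell state
        scores = []
        for x in mfcc_frames:
            z = list(x) + h
            pre = {g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
                   for g in "ifoc"}
            i = [sigmoid(v) for v in pre["i"]]
            f = [sigmoid(v) for v in pre["f"]]
            o = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, cand)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
            # Assumption: a sigmoid keeps the frame-wise estimate between 0 and 1.
            out = sum(w * v for w, v in zip(self.w_out, h)) + self.b_out
            scores.append(sigmoid(out))
        return scores


def utterance_score(frame_scores):
    # Assumption: the utterance-level estimate is the mean of the frame scores.
    return sum(frame_scores) / len(frame_scores)


# Synthetic stand-in for five frames of 13 MFCCs each (not real speech features).
rng = random.Random(1)
frames = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(5)]
model = TinyLSTMRegressor()
frame_scores = model.predict(frames)
```

In a trained system, the frame-wise targets would come from STOI computed with the reference signal during training only, so that inference needs just the degraded speech.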

Original language: English
Pages (from-to): 1207-1208
Number of pages: 2
Journal: IEICE Transactions on Information and Systems
Volume: E101D
Issue number: 4
State: Published - Apr 2018

Keywords

  • LSTM
  • Non-intrusive
  • RNN
  • Speech intelligibility
  • STOI
