Abstract
This paper proposes a non-intrusive speech intelligibility estimation method, requiring no reference speech signal, based on a recurrent neural network (RNN) with a long short-term memory (LSTM) structure. The conventional standard method, P.563, estimates poorly and inconsistently, especially across varied noise and reverberation environments. The proposed method trains the LSTM RNN parameters using short-time objective intelligibility (STOI), the standard intelligibility estimation method that requires a reference speech signal. The input to the LSTM RNN is the mel-frequency cepstral coefficient (MFCC) vector, and its output is the frame-wise STOI value. Experimental results show that the proposed method outperforms the conventional standard P.563 in various noise and reverberation environments.
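
The input/output mapping described in the abstract can be illustrated with a minimal sketch of an LSTM forward pass that turns a sequence of MFCC vectors into one score per frame. Everything specific here is an assumption for illustration and does not come from the paper: the class name `TinyLSTMRegressor`, the 13-dimensional MFCC input, the 8 hidden units, the sigmoid output unit (which assumes targets lie in [0, 1]), the mean over frames as an utterance-level score, and the random, untrained weights.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMRegressor:
    """Single-layer LSTM with a sigmoid output unit (hypothetical, untrained)."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, n_in=13, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One weight matrix per gate, applied to the concatenation [x_t, h_{t-1}].
        self.W = {g: init_matrix(n_hidden, n_in + n_hidden, rng) for g in self.GATES}
        self.b = {g: [0.0] * n_hidden for g in self.GATES}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b_out = 0.0

    def forward(self, frames):
        """Return one estimated score in (0, 1) per input frame."""
        h = [0.0] * self.n_hidden  # hidden state
        c = [0.0] * self.n_hidden  # cell state
        scores = []
        for x in frames:
            z = list(x) + h
            pre = {
                g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
                for g in self.GATES
            }
            i_gate = [sigmoid(v) for v in pre["input"]]
            f_gate = [sigmoid(v) for v in pre["forget"]]
            o_gate = [sigmoid(v) for v in pre["output"]]
            cand = [math.tanh(v) for v in pre["candidate"]]
            # Standard LSTM state update.
            c = [f * ck + i * g for f, ck, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(ck) for o, ck in zip(o_gate, c)]
            out = sum(w * hk for w, hk in zip(self.w_out, h)) + self.b_out
            scores.append(sigmoid(out))
        return scores


# Placeholder input: 50 frames of random 13-dimensional "MFCC" vectors.
rng = random.Random(1)
mfcc_frames = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(50)]

model = TinyLSTMRegressor()
frame_scores = model.forward(mfcc_frames)
# Assumed aggregation: average the frame-wise estimates over the utterance.
utterance_score = sum(frame_scores) / len(frame_scores)
```

In a trained system, the weights would presumably be fitted so that each frame score matches the frame-wise STOI value computed with the reference signal, for example by minimizing a squared error. The abstract does not state the loss function or the network size, so those choices remain open here.
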
| Translated title of the contribution | A Non-Intrusive Speech Intelligibility Estimation Method Based on Recurrent Neural Network with Long Short-Term Memory |
|---|---|
| Original language | Korean |
| Pages (from-to) | 1736-1738 |
| Number of pages | 3 |
| Journal | 한국통신학회논문지 |
| Volume | 42 |
| Issue number | 9 |
| State | Published - Sep 2017 |