TY - JOUR
T1 - Speech recognition using quantized LSP parameters and their transformations in digital communication
AU - Choi, Seung Ho
AU - Kim, Hong Kook
AU - Lee, Hwang Soo
PY - 2000/4
Y1 - 2000/4
N2 - In digital communication networks, speech recognition systems conventionally first reconstruct speech and then extract feature parameters. In this paper, we consider a useful approach of incorporating speech coding parameters into the speech recognizer. Most speech coders employed in digital communication networks use line spectrum pairs (LSPs) as spectral parameters. We introduce two ways to improve the recognition performance of the LSP-based speech recognizer. One is to devise weighted distance measures of LSPs and the other is to transform LSPs into a new feature set, named pseudo-cepstrum (PCEP). The speaker-independent connected-digit recognition experiments based on the discrete hidden Markov model showed that the weighted distance measures provide better recognition accuracy than unweighted ones do. Additionally, a mel-scale PCEP gives an even better performance than the weighted distance measures do. To clarify the performance improvement of the proposed methods, a significance test is introduced. As a result, the proposed methods achieved higher performances in recognition accuracy, compared with the conventional methods employing mel-frequency cepstral coefficients.
UR - http://www.scopus.com/inward/record.url?scp=0033895068&partnerID=8YFLogxK
U2 - 10.1016/S0167-6393(99)00047-3
DO - 10.1016/S0167-6393(99)00047-3
M3 - Article
AN - SCOPUS:0033895068
SN - 0167-6393
VL - 30
SP - 223
EP - 233
JO - Speech Communication
JF - Speech Communication
IS - 4
ER -