Speaking rate control based on time-scale modification and its effects on the performance of speech recognition

Jin Ah Kang, Seung Ho Choi

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper, we describe the influence of speaking rate on speech recognition. The speaking rate of input speech is controlled by applying a time-scale modification (TSM) algorithm, and speaking rate normalisation is achieved by selecting the TSM scale factor. The scale factor for training and testing of a speech recognition system is selected according to a maximum likelihood criterion during hidden Markov model (HMM) decoding. Experimental results show that optimal selection of the TSM scale factor in speaking rate normalisation can reduce the word error rate (WER) by 47.6% compared to the baseline.
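
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which TSM algorithm, candidate factors, or decoder were used. Here `naive_time_scale` is a hypothetical stand-in that resamples by linear interpolation (unlike real TSM, this also shifts pitch), `toy_log_likelihood` is a hypothetical placeholder for the score an HMM decoder would return, and a factor above 1 is assumed to lengthen (slow down) the utterance.

```python
from typing import Callable, List, Sequence, Tuple


def naive_time_scale(samples: Sequence[float], factor: float) -> List[float]:
    """Hypothetical stand-in for TSM: stretch to about len(samples) * factor
    samples by linear interpolation. Real TSM would preserve pitch."""
    n_out = max(2, round(len(samples) * factor))
    step = (len(samples) - 1) / (n_out - 1)
    out = []
    for i in range(n_out):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append((1 - frac) * samples[lo] + frac * samples[hi])
    return out


def select_scale_factor(
    samples: Sequence[float],
    candidates: Sequence[float],
    time_scale: Callable[[Sequence[float], float], List[float]],
    log_likelihood: Callable[[Sequence[float]], float],
) -> Tuple[float, float]:
    """Return the candidate factor whose time-scaled signal scores highest."""
    best_factor, best_score = candidates[0], float("-inf")
    for factor in candidates:
        score = log_likelihood(time_scale(samples, factor))
        if score > best_score:
            best_factor, best_score = factor, score
    return best_factor, best_score


def toy_log_likelihood(samples: Sequence[float]) -> float:
    """Placeholder for a decoder score: pretends the acoustic model
    matches utterances of 100 samples best (an assumption for the demo)."""
    return -float((len(samples) - 100) ** 2)


# A "fast" utterance of 80 samples; candidate factors are illustrative only.
utterance = [float(i % 10) for i in range(80)]
best_factor, best_score = select_scale_factor(
    utterance, [0.8, 1.0, 1.25, 1.5], naive_time_scale, toy_log_likelihood
)
```

In a real system the two callables would be replaced by an actual TSM routine and by the log-likelihood of the best decoding path, so that the factor is chosen per utterance under the maximum likelihood criterion the abstract mentions.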

Original language: English
Pages (from-to): 31-36
Number of pages: 6
Journal: International Journal of Engineering Systems Modelling and Simulation
Volume: 6
Issue number: 1-2
State: Published - 2014

Keywords

  • Speaking rate control
  • Speech recognition
  • Time-scale modification
  • TSM
