Abstract
This paper examines the influence of speaking rate on speech recognition. The speaking rate of the input speech is controlled with a time-scale modification (TSM) algorithm, and speaking rate normalisation is achieved by selecting the TSM scale factor. For both training and testing of the speech recognition system, the scale factor is selected according to a maximum likelihood criterion during HMM decoding. Experimental results show that optimal selection of the TSM scale factor for speaking rate normalisation reduces the word error rate (WER) by 47.6% compared with the baseline.
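The maximum likelihood selection of a scale factor can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the candidate factor grid, and the convention that a factor above 1 lengthens the utterance are all assumptions made for illustration. The TSM routine and the HMM decoder are passed in as callables, and the toy stand-ins below exist only so the example runs.

```python
from typing import Callable, List, Sequence, Tuple


def select_scale_factor(
    samples: Sequence[float],
    factors: Sequence[float],
    time_scale: Callable[[Sequence[float], float], Sequence[float]],
    log_likelihood: Callable[[Sequence[float]], float],
) -> Tuple[float, float]:
    """Return (best_factor, best_score): the candidate whose time-scaled
    signal receives the highest log-likelihood from the scorer."""
    if not factors:
        raise ValueError("at least one candidate scale factor is required")
    scored = [(log_likelihood(time_scale(samples, f)), f) for f in factors]
    best_score, best_factor = max(scored, key=lambda pair: pair[0])
    return best_factor, best_score


# --- Toy stand-ins (hypothetical, for demonstration only) ---

def toy_time_scale(samples: Sequence[float], factor: float) -> List[float]:
    """Nearest-sample resampling. Unlike a real TSM algorithm, this does
    not preserve pitch; it only changes the signal length."""
    new_len = max(1, round(len(samples) * factor))
    last = len(samples) - 1
    return [samples[min(last, int(i / factor))] for i in range(new_len)]


def toy_log_likelihood(samples: Sequence[float], expected_len: int = 120) -> float:
    """Stand-in for an HMM decoding score: peaks when the signal length
    matches the length the (imaginary) acoustic model expects."""
    return float(-((len(samples) - expected_len) ** 2))


utterance = [0.0] * 100  # a fast utterance, shorter than the model expects
candidates = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3]
best, score = select_scale_factor(
    utterance, candidates, toy_time_scale, toy_log_likelihood
)
```

With these stand-ins the search picks the factor that stretches the 100-sample utterance to the 120 samples the toy scorer prefers. In a real system, `time_scale` would be an actual TSM algorithm and `log_likelihood` would be the acoustic score returned by HMM decoding.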
| Original language | English |
|---|---|
| Pages (from-to) | 31-36 |
| Number of pages | 6 |
| Journal | International Journal of Engineering Systems Modelling and Simulation |
| Volume | 6 |
| Issue number | 1-2 |
| State | Published - 2014 |
Keywords
- Speaking rate control
- Speech recognition
- Time-scale modification
- TSM