A cepstral PDF normalization method for noise robust speech recognition

Yong Ho Suk, Seung Ho Choi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

In this paper, we propose a novel cepstrum normalization method based on the scoring procedure of order statistics for speech recognition in additive noise environments. Conventional methods normalize only the mean and/or variance of the cepstrum, which results in an incomplete normalization of the probability density function (PDF). The proposed method fully normalizes the PDF of the cepstrum, so that clean and noisy cepstra have an identical PDF. For the target PDF, the generalized Gaussian distribution is selected so that various densities can be considered. In the recognition phase, a table lookup method is devised to reduce computational cost. In speaker-independent isolated-word recognition experiments, we show that the proposed method improves performance over the conventional methods, especially in heavy noise environments.
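The abstract does not give the algorithm itself, so the following is only a minimal sketch of one common way to realize rank-based PDF normalization with a lookup table, not the authors' procedure. It assumes that each cepstral coefficient is normalized independently over an utterance, that the "score" of a value is its rank mapped to (rank + 0.5) / N, and that the target is a unit-scale generalized Gaussian with density proportional to exp(-|x|^beta). The function names, the table size, and the grid settings are all hypothetical.

```python
import math


def gg_quantile_table(beta=2.0, size=256, span=6.0, grid=4001):
    """Tabulate quantiles of a zero-mean generalized Gaussian.

    Assumed target density: p(x) proportional to exp(-|x|**beta), unit scale.
    The CDF is integrated numerically on [-span, span] and then inverted.
    """
    xs = [-span + 2.0 * span * i / (grid - 1) for i in range(grid)]
    pdf = [math.exp(-abs(x) ** beta) for x in xs]

    # Cumulative trapezoidal integration gives an unnormalized CDF.
    cdf = [0.0]
    for i in range(1, grid):
        cdf.append(cdf[-1] + 0.5 * (pdf[i] + pdf[i - 1]) * (xs[i] - xs[i - 1]))
    total = cdf[-1]

    table, j = [], 0
    for k in range(size):
        target = (k + 0.5) / size * total
        while cdf[j + 1] < target:
            j += 1
        # Linear interpolation between grid points j and j + 1.
        frac = (target - cdf[j]) / (cdf[j + 1] - cdf[j])
        table.append(xs[j] + frac * (xs[j + 1] - xs[j]))
    return table


def normalize_pdf(values, table):
    """Replace each value by the table quantile that matches its rank."""
    n, size = len(values), len(table)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order):
        p = (rank + 0.5) / n  # empirical CDF score from the order statistic
        out[i] = table[min(int(p * size), size - 1)]
    return out


# Hypothetical data: a monotone distortion stands in for the effect of noise.
table = gg_quantile_table(beta=2.0)
clean = [0.3, -1.2, 0.8, 2.1, -0.4]
noisy = [2.0 * c + 3.0 for c in clean]
same = normalize_pdf(clean, table) == normalize_pdf(noisy, table)
```

Because the mapping depends only on ranks, any distortion that preserves the ordering of the values within an utterance yields the same normalized sequence, which is the sense in which the clean and noisy features end up with an identical distribution in this sketch. Real additive noise is not guaranteed to preserve ordering, so this illustrates the idea rather than the reported results. The table is built once, so the per-frame cost at recognition time is a sort plus an index lookup.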

Original language: English
Title of host publication: Advances in Computer Science, Environment, Ecoinformatics, and Education - International Conference, CSEE 2011, Proceedings
Pages: 34-39
Number of pages: 6
Edition: PART 2
State: Published - 2011
Event: International Conference on Advances in Computer Science, Environment, Ecoinformatics, and Education, CSEE 2011 - Wuhan, China
Duration: 21 Aug 2011 → 22 Aug 2011

Publication series

Name: Communications in Computer and Information Science
Number: PART 2
Volume: 215 CCIS
ISSN (Print): 1865-0929

Conference

Conference: International Conference on Advances in Computer Science, Environment, Ecoinformatics, and Education, CSEE 2011
Country/Territory: China
City: Wuhan
Period: 21/08/11 → 22/08/11

Keywords

  • Cepstrum normalization
  • generalized Gaussian distribution
  • noisy speech recognition
  • order statistics
