Interaural time difference estimation using generalized cross-correlation with maximum likelihood weighting in reverberant environments

Ji Hun Park, Seung Ho Choi

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

In this paper, an interaural time difference (ITD) estimation method is proposed for binaural speech separation in reverberant environments. First, the auditory signals are represented in the time-frequency (T-F) domain, and the ITD for each T-F bin is then estimated using generalized cross-correlation (GCC) with a maximum likelihood (ML) weighting function. In particular, the ML weighting function is designed to reduce the reverberation effect. Then, a mask is estimated by comparing the estimated ITD with the ITD corresponding to the location of the pre-defined target speech source. Finally, the target speech is separated by applying the mask to the auditory signals. It is shown that the proposed ITD estimation method outperforms a conventional cross-correlation-based ITD estimation method under reverberant conditions in terms of the signal-to-noise ratio (SNR) and signal-to-distortion ratio (SDR) of the separated speech signals.
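The pipeline described in the abstract (weighted cross-correlation, peak picking to obtain the ITD, then a mask built by comparing against the target ITD) can be illustrated with a minimal sketch. The abstract does not give the form of the ML weighting function, so the code below substitutes the well-known PHAT weighting as a stand-in. It also works on whole frames rather than individual T-F bins and uses a hard tolerance for the mask. The function names `gcc_delay` and `itd_mask`, the 1 ms search range, and the tolerance parameter are assumptions made for illustration, not the authors' method.

```python
import numpy as np


def gcc_delay(left, right, fs, max_delay_s=1e-3, weighting="phat"):
    """Estimate the delay of `right` relative to `left`, in seconds.

    Generalized cross-correlation: the cross-spectrum is multiplied by a
    weighting function before the inverse transform. PHAT is used here as
    a stand-in; the paper's ML weighting is not specified in the abstract.
    """
    n = len(left) + len(right)            # zero-pad to avoid circular wrap-around
    spec_l = np.fft.rfft(left, n=n)
    spec_r = np.fft.rfft(right, n=n)
    cross = spec_r * np.conj(spec_l)      # cross-spectrum
    if weighting == "phat":
        cross = cross / (np.abs(cross) + 1e-12)   # keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_lag = max(1, int(round(max_delay_s * fs)))
    # Reorder so the array covers lags -max_lag .. +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lag = int(np.argmax(cc)) - max_lag
    return lag / fs


def itd_mask(itd_estimates, itd_target, tol_s):
    """Binary mask: 1 where the estimated ITD is within `tol_s` of the target.

    A hard threshold is an assumption; the abstract only states that the
    estimated ITD is compared with the target ITD.
    """
    itd_estimates = np.asarray(itd_estimates, dtype=float)
    return (np.abs(itd_estimates - itd_target) <= tol_s).astype(float)


if __name__ == "__main__":
    fs = 16000
    rng = np.random.default_rng(0)
    left = rng.standard_normal(1024)
    right = np.concatenate((np.zeros(5), left[:-5]))   # right lags left by 5 samples
    print(gcc_delay(left, right, fs) * fs)             # estimated lag in samples
```

In a full system the delay would be estimated per T-F bin (for example per subband and frame), and the resulting mask would be applied to the T-F representation before resynthesis. The sketch keeps only the core GCC and masking steps.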

Original language: English
Pages (from-to): 43-50
Number of pages: 8
Journal: International Journal of Multimedia and Ubiquitous Engineering
Volume: 9
Issue number: 4
State: Published - 2014

Keywords

  • Binaural speech separation
  • Generalized cross-correlation
  • Interaural time difference
  • Maximum likelihood weighting
  • Reverberant environment
