Abstract
This paper proposes an interaural time difference (ITD) estimation method for binaural speech separation in reverberant environments. First, the auditory signals are represented in the time-frequency (T-F) domain, and the ITD of each T-F bin is estimated using generalized cross-correlation (GCC) with a maximum likelihood (ML) weighting function that is designed to reduce the effect of reverberation. Next, a mask is estimated by comparing the estimated ITD with the ITD corresponding to the location of the pre-defined target speech source. Finally, the target speech is separated by applying the mask to the auditory signals. It is shown that the proposed ITD estimation method outperforms a conventional cross-correlation-based ITD estimation method under reverberant conditions in terms of the signal-to-noise ratio (SNR) and signal-to-distortion ratio (SDR) of the separated speech signals.
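The pipeline described in the abstract (weighted GCC for delay estimation, followed by an ITD-based mask) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the ML weighting function, so the sketch substitutes the widely used PHAT weighting as a stand-in, works on a whole signal rather than on individual T-F bins, and assumes a binary mask. The function names `gcc_delay` and `itd_mask`, the tolerance `tol_s`, and the 1 ms search range are all hypothetical choices made for illustration.

```python
import numpy as np

def gcc_delay(left, right, fs, max_delay_s=1e-3, weighting="phat"):
    """Estimate the delay (seconds) of `right` relative to `left` via GCC.

    A positive result means `right` lags `left`.
    `weighting` is "phat" (stand-in for the paper's ML weighting) or "none".
    """
    n = len(left) + len(right)            # zero-pad to avoid circular wrap-around
    spec_l = np.fft.rfft(left, n)
    spec_r = np.fft.rfft(right, n)
    cross = spec_r * np.conj(spec_l)      # cross-power spectrum
    if weighting == "phat":
        # PHAT: keep only the phase of the cross-power spectrum
        cross = cross / np.maximum(np.abs(cross), 1e-12)
    cc = np.fft.irfft(cross, n)           # generalized cross-correlation
    max_lag = max(1, int(round(max_delay_s * fs)))
    # Reorder so the array covers lags -max_lag .. +max_lag
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lag = int(np.argmax(cc)) - max_lag
    return lag / fs

def itd_mask(itd_est, itd_target, tol_s):
    """Binary mask (assumption): True where the estimated ITD is within
    `tol_s` seconds of the target source's ITD."""
    return np.abs(np.asarray(itd_est) - itd_target) <= tol_s

# Small demo: white noise, with the right channel delayed by 5 samples.
fs = 16000
rng = np.random.default_rng(0)
left = rng.standard_normal(1024)
right = np.concatenate((np.zeros(5), left[:-5]))
est = gcc_delay(left, right, fs)
mask = itd_mask(np.array([0.0, 3e-4, 5e-4]), itd_target=3e-4, tol_s=1e-4)
print(est * fs, mask)
```

In a per-bin setting, `gcc_delay` would be applied to each subband frame and `itd_mask` to the resulting array of ITD estimates. The weighting step is the part the paper replaces with its ML function.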
| Original language | English |
|---|---|
| Pages (from-to) | 43-50 |
| Number of pages | 8 |
| Journal | International Journal of Multimedia and Ubiquitous Engineering |
| Volume | 9 |
| Issue number | 4 |
| State | Published - 2014 |
Keywords
- Binaural speech separation
- Generalized cross-correlation
- Interaural time difference
- Maximum likelihood weighting
- Reverberant environment