TY - JOUR
T1 - Interaural time difference estimation using generalized cross-correlation with maximum likelihood weighting in reverberant environments
AU - Park, Ji Hun
AU - Choi, Seung Ho
PY - 2014
Y1 - 2014
N2 - In this paper, an interaural time difference (ITD) estimation method is proposed for binaural speech separation in reverberant environments. First, the auditory signals are represented in the time-frequency (T-F) domain, and the ITD for each T-F bin is then estimated using generalized cross-correlation (GCC) with a maximum likelihood (ML) weighting function. In particular, the ML weighting function is designed to reduce the reverberation effect. Then, a mask is estimated by comparing the estimated ITD with the ITD corresponding to the location of the pre-defined target speech source. Finally, the target speech is separated by applying the mask to the auditory signals. It is shown that the proposed ITD estimation method outperforms a conventional cross-correlation-based ITD estimation method under reverberant conditions in terms of the signal-to-noise ratio (SNR) and signal-to-distortion ratio (SDR) of the separated speech signals.
AB - In this paper, an interaural time difference (ITD) estimation method is proposed for binaural speech separation in reverberant environments. First, the auditory signals are represented in the time-frequency (T-F) domain, and the ITD for each T-F bin is then estimated using generalized cross-correlation (GCC) with a maximum likelihood (ML) weighting function. In particular, the ML weighting function is designed to reduce the reverberation effect. Then, a mask is estimated by comparing the estimated ITD with the ITD corresponding to the location of the pre-defined target speech source. Finally, the target speech is separated by applying the mask to the auditory signals. It is shown that the proposed ITD estimation method outperforms a conventional cross-correlation-based ITD estimation method under reverberant conditions in terms of the signal-to-noise ratio (SNR) and signal-to-distortion ratio (SDR) of the separated speech signals.
KW - Binaural speech separation
KW - Generalized cross-correlation
KW - Interaural time difference
KW - Maximum likelihood weighting
KW - Reverberant environment
UR - https://www.scopus.com/pages/publications/84899683155
U2 - 10.14257/ijmue.2014.9.4.05
DO - 10.14257/ijmue.2014.9.4.05
M3 - Article
AN - SCOPUS:84899683155
SN - 1975-0080
VL - 9
SP - 43
EP - 50
JO - International Journal of Multimedia and Ubiquitous Engineering
JF - International Journal of Multimedia and Ubiquitous Engineering
IS - 4
ER -