TY - GEN
T1 - Binaural localization of speech sources in the median plane using cepstral HRTF extraction
AU - Talagala, Dumidu S.
AU - Wu, Xiang
AU - Zhang, Wen
AU - Abhayapala, Thushara D.
N1 - Publisher Copyright: © 2014 EURASIP.
PY - 2014/11/10
Y1 - 2014/11/10
N2 - In binaural systems, source localization in the median plane is challenging due to the difficulty of exploring the spectral cues of the head-related transfer function (HRTF) independently of the source spectra. This paper presents a method of extracting the HRTF spectral cues using cepstral analysis for speech source localization in the median plane. Binaural signals are preprocessed in the cepstral domain so that the fine spectral structure of speech and the HRTF spectral envelope can be easily separated. We introduce (i) a truncated cepstral transformation to extract the relevant localization cues, and (ii) a mechanism to normalize the effects of the time-varying speech spectra. The proposed method is evaluated and compared with a convolution-based localization method using a speech corpus of multiple speakers. The results suggest that the proposed method fully exploits the available spectral cues for robust speaker-independent binaural source localization in the median plane.
KW - Binaural localization
KW - cepstral transformation
KW - head-related transfer function (HRTF)
KW - median plane
UR - http://www.scopus.com/inward/record.url?scp=84911942485&partnerID=8YFLogxK
M3 - Conference contribution
T3 - European Signal Processing Conference
SP - 2055
EP - 2059
BT - 2014 Proceedings of the 22nd European Signal Processing Conference, EUSIPCO 2014
PB - European Signal Processing Conference, EUSIPCO
T2 - 22nd European Signal Processing Conference, EUSIPCO 2014
Y2 - 1 September 2014 through 5 September 2014
ER -