TY - GEN
T1 - Binaural localization of speech sources in 3-D using a composite feature vector of the HRTF
AU - Wu, Xiang
AU - Talagala, Dumidu S.
AU - Zhang, Wen
AU - Abhayapala, Thushara D.
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/8/4
Y1 - 2015/8/4
AB - Binaural localization of speech sources in 3-D, using head-related transfer functions (HRTFs), always suffers elevation ambiguity due to the limited high frequency spectral information available at the receivers. This paper presents a method that overcomes this limitation by exploiting the interaural phase and magnitude features present in the HRTF. We (i) introduce a new feature vector that combines these two sets of features in a non-linear fashion, and (ii) propose a mechanism to extract this feature vector free from distortion by the speech spectra. The performance of the proposed method is evaluated and compared with a correlation-based HRTF database matching approach and a two-step localization technique for multiple source positions, HRTFs (individuals) and speech inputs. The results suggest that up to 20% improvement in localization performance can be achieved for moderate signal-to-noise ratios.
KW - Binaural localization
KW - cepstral transformation
KW - generalized cross-correlation (GCC)
KW - head related transfer function (HRTF)
KW - phase transform (PHAT)
UR - http://www.scopus.com/inward/record.url?scp=84946098067&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2015.7178452
DO - 10.1109/ICASSP.2015.7178452
M3 - Conference contribution
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 2654
EP - 2658
BT - 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Y2 - 19 April 2015 through 24 April 2015
ER -