TY - JOUR
T1 - Semi-Supervised Multiple Source Localization Using Relative Harmonic Coefficients under Noisy and Reverberant Environments
AU - Hu, Yonggang
AU - Samarasinghe, Prasanga N.
AU - Gannot, Sharon
AU - Abhayapala, Thushara D.
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2020
Y1 - 2020
N2 - This article develops a semi-supervised algorithm for the challenging problem of multi-source localization in noisy and reverberant environments, using a spherical-harmonics-domain source feature, the relative harmonic coefficients. We present a comprehensive study of this source feature, including (i) an illustration confirming its sole dependence on the source position, (ii) a feature estimator for use in the presence of noise, and (iii) a feature selector exploiting its inherent directivity over space. Source features at different spherical harmonic modes, each providing a unique characterization of the sound field, are fused through multi-mode Gaussian process modeling. Based on this unifying model, we then formulate a mapping function that reveals the underlying relationship between the source feature(s) and position(s) using a Bayesian inference approach. The additional issue of overlapping source components is addressed by a pre-processing technique that performs overlapped frame detection, which in turn reduces the multi-source problem to single-source localization. This data-driven method has strong potential for practical implementation because it requires only a limited number of labeled measurements. We evaluate the proposed algorithm using simulated recordings of multiple speakers in diverse environments, and extensive results confirm improved performance in comparison with state-of-the-art methods. Additional assessments using real-life recordings further demonstrate the effectiveness of the method, even under unfavorable conditions with severe source overlap.
KW - Gaussian Process regression
KW - multi-mode Gaussian Process
KW - relative harmonic coefficients
KW - semi-supervised multiple source localization
KW - source feature estimator
UR - http://www.scopus.com/inward/record.url?scp=85097333311&partnerID=8YFLogxK
U2 - 10.1109/TASLP.2020.3037521
DO - 10.1109/TASLP.2020.3037521
M3 - Article
SN - 2329-9290
VL - 28
SP - 3108
EP - 3123
JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing
JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing
M1 - 9259082
ER -