TY - GEN
T1 - Supervised and semi-supervised speech enhancement using weighted nonnegative matrix factorization
AU - Zou, Xia
AU - Hu, Yonggang
AU - Zhang, Xiongwei
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/12/7
Y1 - 2017/12/7
N2 - According to the prior knowledge used, NMF-based speech enhancement can be categorized as supervised (NMF) or semi-supervised (SNMF). In the supervised version, speech is estimated with prior knowledge of both speech and noise. In the semi-supervised version, speech bases are learned in advance from clean speech data and adapted to different noise conditions. Using a probabilistic estimate of whether speech is present in a given frame, this paper extends both NMF- and SNMF-based speech enhancement techniques by incorporating the speech presence probability (SPP) into their multiplicative update process. The performance of the algorithms is evaluated on three metrics (PESQ, LSD and SDR) in experiments on TIMIT with 20 noise types at various signal-to-noise ratio (SNR) levels. All the proposed algorithms outperform the conventional NMF- and SNMF-based speech enhancement techniques, and also perform better than some unsupervised state-of-the-art algorithms.
AB - According to the prior knowledge used, NMF-based speech enhancement can be categorized as supervised (NMF) or semi-supervised (SNMF). In the supervised version, speech is estimated with prior knowledge of both speech and noise. In the semi-supervised version, speech bases are learned in advance from clean speech data and adapted to different noise conditions. Using a probabilistic estimate of whether speech is present in a given frame, this paper extends both NMF- and SNMF-based speech enhancement techniques by incorporating the speech presence probability (SPP) into their multiplicative update process. The performance of the algorithms is evaluated on three metrics (PESQ, LSD and SDR) in experiments on TIMIT with 20 noise types at various signal-to-noise ratio (SNR) levels. All the proposed algorithms outperform the conventional NMF- and SNMF-based speech enhancement techniques, and also perform better than some unsupervised state-of-the-art algorithms.
KW - Non-negative matrix factorization
KW - semi-supervised NMF
KW - speech presence probability
KW - supervised NMF
UR - http://www.scopus.com/inward/record.url?scp=85046487774&partnerID=8YFLogxK
U2 - 10.1109/WCSP.2017.8170960
DO - 10.1109/WCSP.2017.8170960
M3 - Conference contribution
T3 - 2017 9th International Conference on Wireless Communications and Signal Processing, WCSP 2017 - Proceedings
SP - 1
EP - 5
BT - 2017 9th International Conference on Wireless Communications and Signal Processing, WCSP 2017 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 9th International Conference on Wireless Communications and Signal Processing, WCSP 2017
Y2 - 11 October 2017 through 13 October 2017
ER -