Supervised and semi-supervised speech enhancement using weighted nonnegative matrix factorization

Xia Zou, Yonggang Hu, Xiongwei Zhang

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review


    Depending on the prior knowledge they use, NMF-based speech enhancement algorithms can be categorized as supervised (NMF) or semi-supervised (SNMF). In the supervised version, speech is estimated with prior knowledge of both speech and noise. In the semi-supervised version, speech bases are learned in advance on clean speech data and adapted to different noise conditions. Using a probabilistic estimate of whether speech is present in a given frame, this paper extends both NMF- and SNMF-based speech enhancement techniques by incorporating the speech presence probability (SPP) into their multiplicative update process. The algorithms are evaluated on three metrics (PESQ, LSD and SDR) in experiments on TIMIT with 20 noise types at various signal-to-noise ratio (SNR) levels. All the proposed algorithms outperform the conventional NMF- and SNMF-based speech enhancement techniques, and also perform better than some state-of-the-art unsupervised algorithms.
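
    The abstract does not give the update rules, so the following is only a minimal sketch of a generic weighted NMF with multiplicative updates, not the authors' algorithm. It assumes a Euclidean cost and uses a per-frame weight vector as a stand-in for the SPP; the function name `weighted_nmf`, its parameters, and the choice of cost are all hypothetical.

```python
import numpy as np

def weighted_nmf(V, weights, rank, n_iter=100, eps=1e-12, seed=0):
    """Generic weighted NMF (Euclidean cost), illustrative only.

    Minimizes sum(weights * (V - B @ H) ** 2) with B, H >= 0.
    V       : (F, T) nonnegative magnitude spectrogram
    weights : nonnegative array broadcastable to (F, T); ASSUMPTION:
              a (1, T) row of per-frame values standing in for the SPP
    rank    : number of basis vectors
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    L = np.broadcast_to(weights, V.shape)
    B = rng.random((F, rank)) + eps   # bases
    H = rng.random((rank, T)) + eps   # activations
    LV = L * V
    costs = []
    for _ in range(n_iter):
        # Multiplicative updates keep B and H nonnegative.
        H *= (B.T @ LV) / (B.T @ (L * (B @ H)) + eps)
        B *= (LV @ H.T) / ((L * (B @ H)) @ H.T + eps)
        costs.append(float(np.sum(L * (V - B @ H) ** 2)))
    return B, H, costs

# Usage on synthetic data (random values, not real speech).
rng = np.random.default_rng(1)
V = rng.random((20, 30))      # fake spectrogram: 20 bins, 30 frames
spp = rng.random((1, 30))     # fake per-frame presence probabilities
B, H, costs = weighted_nmf(V, spp, rank=4, n_iter=50)
```

    In this sketch, frames with a low weight contribute less to the basis update, which is one plausible way a presence probability could enter the factorization. With purely per-frame weights the weight cancels in the activation update, so only the bases are affected here.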

    Original language: English
    Title of host publication: 2017 9th International Conference on Wireless Communications and Signal Processing, WCSP 2017 - Proceedings
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Number of pages: 5
    ISBN (Electronic): 9781538620625
    Publication status: Published - 7 Dec 2017
    Event: 9th International Conference on Wireless Communications and Signal Processing, WCSP 2017 - Nanjing, China
    Duration: 11 Oct 2017 to 13 Oct 2017

    Publication series
    Name: 2017 9th International Conference on Wireless Communications and Signal Processing, WCSP 2017 - Proceedings

    Conference: 9th International Conference on Wireless Communications and Signal Processing, WCSP 2017

