Single-channel speech dereverberation in noisy environment for non-orthogonal signals

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Reverberation degrades speech quality, limits the performance of automatic speech recognition systems, and impairs the performance of hearing aids. Spectral enhancement (SE) is a popular method for suppressing late reverberation and background noise. However, conventional SE-based approaches assume orthogonality between the desired and undesired signal components. In most practical cases this assumption does not hold, because of the limited time-domain support and the short-time stationarity of speech signals, and estimation accuracy suffers as a result. To circumvent this issue, Lu et al. relaxed the orthogonality assumption by proposing a geometric approach to spectral subtraction (GSS) and evaluated their algorithm against different kinds of background noise. In our work, we analyze the model by means of a simplified GSS transfer function to gain insight into the algorithm. We conduct a series of experiments to validate GSS and explore its limitations in diverse realistic scenarios with both reverberation and background noise, using an end-to-end system for speech dereverberation and noise suppression. We also analyze the performance of GSS on the experimental data of the 2014 REVERB challenge and compare it with conventional approaches such as spectral subtraction, the Wiener filter, the minimum mean square error short-time spectral amplitude estimator, and the log spectral amplitude estimator, as well as with the contemporary methods of the 2014 REVERB challenge.
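
The orthogonality assumption can be made concrete. If a short-time spectrum is written as Y = X + D, with X the desired component and D the undesired one, then |Y|^2 = |X|^2 + |D|^2 + 2 Re(X D*), and conventional SE approaches treat the last (cross) term as zero. The sketch below is only an illustration of that identity and is not the GSS algorithm or the authors' system: white-noise frames stand in for speech and interference, and the function name `power_decomposition` and the frame length are assumptions chosen for the example.

```python
import numpy as np


def power_decomposition(x_frame, d_frame):
    """Split the observed power spectrum of one frame into additive and cross terms.

    Returns (additive, cross, observed) where
    observed = |X + D|^2, additive = |X|^2 + |D|^2, cross = 2 * Re(X * conj(D)).
    """
    window = np.hanning(len(x_frame))      # short-time analysis window
    X = np.fft.rfft(window * x_frame)      # spectrum of the desired component
    D = np.fft.rfft(window * d_frame)      # spectrum of the undesired component
    additive = np.abs(X) ** 2 + np.abs(D) ** 2
    cross = 2.0 * np.real(X * np.conj(D))
    observed = np.abs(X + D) ** 2
    return additive, cross, observed


rng = np.random.default_rng(0)
frame_len = 512                            # hypothetical frame length
x = rng.standard_normal(frame_len)         # stand-in for a speech frame (assumption)
d = rng.standard_normal(frame_len)         # stand-in for reverberation plus noise (assumption)

additive, cross, observed = power_decomposition(x, d)

# The identity holds exactly in every frequency bin.
identity_holds = np.allclose(observed, additive + cross)

# Average size of the cross term relative to the additive terms in this single frame.
relative_cross = float(np.mean(np.abs(cross) / additive))
print(identity_holds, relative_cross)
```

Even for statistically independent signals, the cross term in a single finite frame is generally not negligible; it vanishes only on average. This is the gap that a geometric treatment of spectral subtraction is meant to address.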

    Original language: English
    Pages (from-to): 1041-1055
    Number of pages: 15
    Journal: Acta Acustica united with Acustica
    Volume: 104
    Issue number: 6
    Publication status: Published - 1 Nov 2018
