Combating Reverberation in NTF-based Speech Separation using a Sub-Source Weighted Multichannel Wiener Filter and Linear Prediction
(Oral presentation)

Mieszko Fraś (AGH UST, Poland), Marcin Witkowski (AGH UST, Poland), Konrad Kowalczyk (AGH UST, Poland)

Sound source separation (SS) from microphone signals that capture speech in reverberant conditions is a formidable task. This paper addresses joint separation and dereverberation of speech using a multichannel Wiener filter (MWF) tailored to sub-source modeling of each speech source with a full-rank mixing matrix. Specifically, the parameters of the proposed sub-source-weighted (SSW) spatial filter are estimated using the sub-source-based expectation-maximization (EM) algorithm with multiplicative updates (MU) and a localization prior (LP) distribution on the mixing matrix (SSEM-MU-LP). In addition, dereverberation is strengthened by incorporating the Generalized Weighted Prediction Error (GWPE) algorithm. The proposed method is evaluated on a large dataset of two-channel recordings of clean speech convolved with both real and synthesized impulse responses. The experiments show that, in reverberant conditions, the proposed method outperforms standard NTF-based separation with the vanilla MWF in terms of signal-to-distortion ratio (an improvement of 3–5.6 dB) and other commonly used sound separation metrics.
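
For orientation, the vanilla MWF baseline mentioned in the abstract can be sketched for a single time-frequency bin of a two-channel mixture. This is only an illustrative sketch of the textbook filter (each source estimate is its spatial covariance times the inverse mixture covariance, applied to the observed vector); it is not the proposed SSW filter, and the function names and covariance values are assumptions made up for the example.

```python
# Vanilla multichannel Wiener filter (MWF) for ONE time-frequency bin and
# two microphones. Helper names and all numeric values are hypothetical.

def mat_add(a, b):
    """Element-wise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_inv(a):
    """Inverse of a 2x2 matrix via the closed-form adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(det) < 1e-12:
        raise ValueError("mixture covariance is (numerically) singular")
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def mat_vec(a, x):
    """Product of a 2x2 matrix with a length-2 vector."""
    return [a[i][0] * x[0] + a[i][1] * x[1] for i in range(2)]

def mwf_separate(x, source_covs):
    """Return one spatial-image estimate per source: R_j (sum_k R_k)^-1 x."""
    mix_cov = source_covs[0]
    for cov in source_covs[1:]:
        mix_cov = mat_add(mix_cov, cov)
    mix_inv = mat_inv(mix_cov)
    return [mat_vec(mat_mul(cov, mix_inv), x) for cov in source_covs]

# Hypothetical observed STFT coefficients at the two microphones.
x = [1 + 1j, 0.5 - 2j]
# Hypothetical full-rank (Hermitian, positive definite) source covariances.
r1 = [[2.0, 0.5 + 0.5j], [0.5 - 0.5j, 1.0]]
r2 = [[1.0, -0.2j], [0.2j, 3.0]]

estimates = mwf_separate(x, [r1, r2])
for j, est in enumerate(estimates, start=1):
    print(f"source {j} image estimate:", est)
```

Because the per-source filters sum to the identity matrix, the estimated source images add back up to the observed mixture, which is a convenient sanity check for any implementation of this baseline.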

Related Recordings

A Hands-on Comparison of DNNs for Dialog Separation Using Transfer Learning from Music Source Separation
(Oral presentation)

Martin Strauss, Jouni Paulus, Matteo Torcoli, Bernd Edler

GlobalPhone Mix-to-Separate out of 2: A Multilingual 2000 Speakers Mixtures Database for Speech Separation
(Oral presentation)

Marvin Borsdorf, Chenglin Xu, Haizhou Li, Tanja Schultz

InterSpeech 2021
