Funnel Deep Complex U-net for Phase-Aware Speech Enhancement (3 minutes introduction)

Yuhang Sun (OPPO, China), Linju Yang (OPPO, China), Huifeng Zhu (OPPO, China), Jie Hao (OPPO, China)

Deep neural networks have substantially advanced speech enhancement. Most early models estimated only the magnitude spectrum and ignored the phase, which places an upper limit on achievable performance. Recent research has proposed deep complex networks, which handle complex-valued inputs and jointly estimate the magnitude and phase spectra by outputting the real and imaginary parts separately. The encoder-decoder structure of Deep Complex U-net (DCU) has proven effective for complex-valued data. To further improve performance, this paper presents a new network called Funnel Deep Complex U-net (FDCU), which processes magnitude and phase information separately through a one-encoder, two-decoder structure. Moreover, to improve training, we define the negative stretched-SI-SNR as the loss function to avoid errors caused by a negative vector angle. Experimental results show that our FDCU model outperforms state-of-the-art approaches on all evaluation metrics.
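
For orientation, the following is a minimal sketch of the standard SI-SNR measure and its negation as a loss. It is not the stretched-SI-SNR from the paper, whose definition is not given in this abstract; the function names `si_snr` and `negative_si_snr_loss` and the `eps` parameter are assumptions made for illustration. The sketch also shows that standard SI-SNR scores a sign-flipped estimate the same as a perfect one, which may be the kind of negative-vector-angle error the abstract refers to.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Standard scale-invariant SNR in dB (not the paper's stretched variant)."""
    n = len(target)
    # Remove the mean of each signal so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]
    # Project the estimate onto the target to get the scaled target component.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    s_target = [scale * t for t in tgt]
    # Whatever is left after the projection is treated as noise.
    e_noise = [e - s for e, s in zip(est, s_target)]
    num = sum(s * s for s in s_target)
    den = sum(x * x for x in e_noise)
    return 10.0 * math.log10((num + eps) / (den + eps))


def negative_si_snr_loss(estimate, target):
    """Loss to minimize: higher SI-SNR gives a lower loss."""
    return -si_snr(estimate, target)


if __name__ == "__main__":
    target = [0.0, 1.0, 0.0, -1.0, 0.5, -0.5]
    noise = [0.05, -0.03, 0.02, 0.04, -0.01, -0.02]
    noisy = [t + d for t, d in zip(target, noise)]
    flipped = [-t for t in target]
    print("noisy estimate:  ", si_snr(noisy, target))
    print("perfect estimate:", si_snr(target, target))
    # The projection term is squared, so the sign of the estimate is lost.
    print("flipped estimate:", si_snr(flipped, target))
```

Because the projected target energy is squared, an estimate pointing in the opposite direction of the target receives the same score as a perfect estimate here. A loss that must penalize that case needs additional handling of the sign.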

Related Recordings

Perceptual Contributions of Vowels and Consonant-vowel Transitions in Understanding Time-compressed Mandarin Sentences
(3 minutes introduction)

Changjie Pan, Feng Yang, Fei Chen

Speech Enhancement with Weakly Labelled Data from AudioSet
(3 minutes introduction)

Qiuqiang Kong, Haohe Liu, Xingjian Du, Li Chen, Rui Xia, Yuxuan Wang

InterSpeech 2021