InterSpeech 2021

Funnel Deep Complex U-net for Phase-Aware Speech Enhancement
(3 minutes introduction)

Yuhang Sun (OPPO, China), Linju Yang (OPPO, China), Huifeng Zhu (OPPO, China), Jie Hao (OPPO, China)
The emergence of deep neural networks has made speech enhancement well developed. Most of the early models focused on estimating the magnitude of spectrum while ignoring the phase, this gives the evaluation result a certain upper limit. Some recent researches proposed deep complex network, which can handle complex inputs, and realize joint estimation of magnitude spectrum and phase spectrum by outputting real and imaginary parts respectively. The encoder-decoder structure in Deep Complex U-net (DCU) has been proven to be effective for complex-valued data. To further improve the performance, in this paper, we design a new network called Funnel Deep Complex U-net (FDCU), which could process magnitude information and phase information separately through one-encoder-two-decoders structure. Moreover, in order to achieve better training effect, we define negative stretched-SI-SNR as the loss function to avoid errors caused by the negative vector angle. Experimental results show that our FDCU model outperforms state-of-the-art approaches in all evaluation metrics.