InterSpeech 2021

Speech coding and privacy

NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling
(3-minute introduction)

Junhyeok Lee (MINDs Lab, Korea), Seungu Han (MINDs Lab, Korea)

NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling
(longer introduction)

Junhyeok Lee (MINDs Lab, Korea), Seungu Han (MINDs Lab, Korea)

WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution
(3-minute introduction)

Kexun Zhang (Zhejiang University, China), Yi Ren (Zhejiang University, China), Changliang Xu (Xinhua News Agency, China), Zhou Zhao (Zhejiang University, China)

Multi-channel Opus compression for far-field automatic speech recognition with a fixed bitrate budget
(longer introduction)

Lukas Drude (Amazon, Germany), Jahn Heymann (Amazon, Germany), Andreas Schwarz (Amazon, Germany), Jean-Marc Valin (Amazon, USA)

Improving the expressiveness of neural vocoding with non-affine Normalizing Flows
(3-minute introduction)

Adam Gabryś (Amazon, Poland), Yunlong Jiao (Amazon, UK), Viacheslav Klimkov (Amazon, Germany), Daniel Korzekwa (Amazon, Poland), Roberto Barra-Chicote (Amazon, UK)

A Two-stage Approach to Speech Bandwidth Extension
(3-minute introduction)

Ju Lin (Clemson University, USA), Yun Wang (Facebook, USA), Kaustubh Kalgaonkar (Facebook, USA), Gil Keren (Facebook, USA), Didi Zhang (Facebook, USA), Christian Fuegen (Facebook, USA)

Development of a Psychoacoustic Loss Function for the Deep Neural Network (DNN)-Based Speech Coder
(3-minute introduction)

Joon Byun (Yonsei University, Korea), Seungmin Shin (Yonsei University, Korea), Youngcheol Park (Yonsei University, Korea), Jongmo Sung (ETRI, Korea), Seungkwon Beack (ETRI, Korea)

Protecting gender and identity with disentangled speech representations
(3-minute introduction)

Dimitrios Stoidis (Queen Mary University of London, UK), Andrea Cavallaro (Queen Mary University of London, UK)