InterSpeech 2021

Novel neural network architectures for ASR

Dynamic Encoder Transducer: A Flexible Solution For Trading Off Accuracy For Latency
(3 minutes introduction)

Yangyang Shi (Facebook, USA), Varun Nagaraja (Facebook, USA), Chunyang Wu (Facebook, USA), Jay Mahadeokar (Facebook, USA), Duc Le (Facebook, USA), Rohit Prabhavalkar (Facebook, USA), Alex Xiao (Facebook, USA), Ching-Feng Yeh (Facebook, USA), Julian Chan (Facebook, USA), Christian Fuegen (Facebook, USA), Ozlem Kalinli (Facebook, USA), Michael L. Seltzer (Facebook, USA)

Librispeech Transducer Model with Internal Language Model Prior Correction
(3 minutes introduction)

Albert Zeyer (RWTH Aachen University, Germany), André Merboldt (RWTH Aachen University, Germany), Wilfried Michel (RWTH Aachen University, Germany), Ralf Schlüter (RWTH Aachen University, Germany), Hermann Ney (RWTH Aachen University, Germany)

A Deliberation-based Joint Acoustic and Text Decoder
(3 minutes introduction)

Sepand Mavandadi (Google, USA), Tara N. Sainath (Google, USA), Ke Hu (Google, USA), Zelin Wu (Google, USA)

On the Limit of English Conversational Speech Recognition
(3 minutes introduction)

Zoltán Tüske (IBM, USA), George Saon (IBM, USA), Brian Kingsbury (IBM, USA)

SpeechMoE: Scaling to Large Acoustic Models with Dynamic Routing Mixture of Experts
(3 minutes introduction)

Zhao You (Tencent, China), Shulin Feng (Tencent, China), Dan Su (Tencent, China), Dong Yu (Tencent, USA)

Online Compressive Transformer for End-to-End Speech Recognition
(3 minutes introduction)

Chi-Hang Leong (NYCU, Taiwan), Yu-Han Huang (NYCU, Taiwan), Jen-Tzung Chien (NYCU, Taiwan)

A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition
(3 minutes introduction)

Shigeki Karita (Google, Japan), Yotaro Kubo (Google, Japan), Michiel Adriaan Unico Bacchiani (Google, Japan), Llion Jones (Google, Japan)