InterSpeech 2021

Resource-constrained ASR

Compressing 1D Time-Channel Separable Convolutions using Sparse Random Ternary Matrices
(3 minutes introduction)

Gonçalo Mordido (HPI, Germany), Matthijs Van keirsbilck (NVIDIA, Germany), Alexander Keller (NVIDIA, Germany)

Compressing 1D Time-Channel Separable Convolutions using Sparse Random Ternary Matrices
(longer introduction)

Gonçalo Mordido (HPI, Germany), Matthijs Van keirsbilck (NVIDIA, Germany), Alexander Keller (NVIDIA, Germany)

Weakly Supervised Construction of ASR Systems from Massive Video Data
(3 minutes introduction)

Mengli Cheng (Alibaba, China), Chengyu Wang (Alibaba, China), Jun Huang (Alibaba, China), Xiaobo Wang (Alibaba, China)

Weakly Supervised Construction of ASR Systems from Massive Video Data
(longer introduction)

Mengli Cheng (Alibaba, China), Chengyu Wang (Alibaba, China), Jun Huang (Alibaba, China), Xiaobo Wang (Alibaba, China)

Extremely Low Footprint End-to-End ASR System for Smart Device
(3 minutes introduction)

Zhifu Gao (Alibaba, China), Yiwu Yao (Alibaba, China), Shiliang Zhang (Alibaba, China), Jun Yang (Alibaba, China), Ming Lei (Alibaba, China), Ian McLoughlin (SIT, Singapore)

Tied & Reduced RNN-T Decoder
(3 minutes introduction)

Rami Botros (Google, USA), Tara N. Sainath (Google, USA), Robert David (Google, USA), Emmanuel Guzman (Google, USA), Wei Li (Google, USA), Yanzhang He (Google, USA)

PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation
(3 minutes introduction)

Jangho Kim (Qualcomm, Korea), Simyung Chang (Qualcomm, Korea), Nojun Kwak (Seoul National University, Korea)

Collaborative Training of Acoustic Encoders for Speech Recognition
(3 minutes introduction)

Varun Nagaraja (Facebook, USA), Yangyang Shi (Facebook, USA), Ganesh Venkatesh (Facebook, USA), Ozlem Kalinli (Facebook, USA), Michael L. Seltzer (Facebook, USA), Vikas Chandra (Facebook, USA)