InterSpeech 2021

Topics in ASR: Adaptation, transfer learning, children's speech, and low-resource settings

Semantic Data Augmentation for End-to-End Mandarin Speech Recognition
(3-minute introduction)

Jianwei Sun (KE, China), Zhiyuan Tang (KE, China), Hengxin Yin (KE, China), Wei Wang (KE, China), Xi Zhao (KE, China), Shuaijiang Zhao (KE, China), Xiaoning Lei (KE, China), Wei Zou (KE, China), Xiangang Li (KE, China)

Low Resource German ASR with Untranscribed Data Spoken by Non-native Children - INTERSPEECH 2021 Shared Task SPAPL System
(3-minute introduction)

Jinhan Wang (University of California at Los Angeles, USA), Yunzheng Zhu (University of California at Los Angeles, USA), Ruchao Fan (University of California at Los Angeles, USA), Wei Chu (PAII, USA), Abeer Alwan (University of California at Los Angeles, USA)

Low Resource German ASR with Untranscribed Data Spoken by Non-native Children - INTERSPEECH 2021 Shared Task SPAPL System
(longer introduction)

Jinhan Wang (University of California at Los Angeles, USA), Yunzheng Zhu (University of California at Los Angeles, USA), Ruchao Fan (University of California at Los Angeles, USA), Wei Chu (PAII, USA), Abeer Alwan (University of California at Los Angeles, USA)

Speaker normalization using Joint Variational Autoencoder
(3-minute introduction)

Shashi Kumar (Samsung, India), Shakti P. Rath (Reverie Language Technologies, India), Abhishek Pandey (Samsung, India)

The TAL system for the INTERSPEECH2021 Shared Task on Automatic Speech Recognition for Non-Native Children’s Speech
(3-minute introduction)

Gaopeng Xu (TAL, China), Song Yang (TAL, China), Lu Ma (TAL, China), Chengfei Li (TAL, China), Zhongqin Wu (TAL, China)

The TAL system for the INTERSPEECH2021 Shared Task on Automatic Speech Recognition for Non-Native Children’s Speech
(longer introduction)

Gaopeng Xu (TAL, China), Song Yang (TAL, China), Lu Ma (TAL, China), Chengfei Li (TAL, China), Zhongqin Wu (TAL, China)

Zero-shot Cross-Lingual Phonetic Recognition with External Language Embedding
(3-minute introduction)

Heting Gao (University of Illinois at Urbana-Champaign, USA), Junrui Ni (University of Illinois at Urbana-Champaign, USA), Yang Zhang (MIT-IBM Watson AI Lab, USA), Kaizhi Qian (MIT-IBM Watson AI Lab, USA), Shiyu Chang (MIT-IBM Watson AI Lab, USA), Mark Hasegawa-Johnson (University of Illinois at Urbana-Champaign, USA)

Best of Both Worlds: Robust Accented Speech Recognition with Adversarial Transfer Learning
(3-minute introduction)

Nilaksh Das (Georgia Tech, USA), Sravan Bodapati (Amazon, USA), Monica Sunkara (Amazon, USA), Sundararajan Srinivasan (Amazon, USA), Duen Horng Chau (Georgia Tech, USA)