Interspeech 2021

Low-resource speech recognition

Towards unsupervised phone and word segmentation using self-supervised vector-quantized neural networks
(3-minute introduction)

Herman Kamper (Stellenbosch University, South Africa), Benjamin van Niekerk (Stellenbosch University, South Africa)

Speech SimCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning
(3-minute introduction)

Dongwei Jiang (YuanFuDao, China), Wubo Li (DiDi Chuxing, China), Miao Cao (DiDi Chuxing, China), Wei Zou (DiDi Chuxing, China), Xiangang Li (DiDi Chuxing, China)

Speech SimCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning
(longer introduction)

Dongwei Jiang (YuanFuDao, China), Wubo Li (DiDi Chuxing, China), Miao Cao (DiDi Chuxing, China), Wei Zou (DiDi Chuxing, China), Xiangang Li (DiDi Chuxing, China)

Multilingual transfer of acoustic word embeddings improves when training on languages related to the target zero-resource language
(3-minute introduction)

Christiaan Jacobs (Stellenbosch University, South Africa), Herman Kamper (Stellenbosch University, South Africa)

Analyzing Speaker Information in Self-Supervised Models to Improve Zero-Resource Speech Processing
(3-minute introduction)

Benjamin van Niekerk (Stellenbosch University, South Africa), Leanne Nortje (Stellenbosch University, South Africa), Matthew Baas (Stellenbosch University, South Africa), Herman Kamper (Stellenbosch University, South Africa)

The Zero Resource Speech Challenge 2021: Spoken language modelling
(3-minute introduction)

Ewan Dunbar (University of Toronto, Canada), Mathieu Bernard (LSCP (UMR 8554), France), Nicolas Hamilakis (LSCP (UMR 8554), France), Tu Anh Nguyen (LSCP (UMR 8554), France), Maureen de Seyssel (LSCP (UMR 8554), France), Patricia Rozé (LSCP (UMR 8554), France), Morgane Rivière (Facebook, France), Eugene Kharitonov (Facebook, France), Emmanuel Dupoux (LSCP (UMR 8554), France)

Zero-Shot Federated Learning with New Classes for Audio Classification
(3-minute introduction)

Gautham Krishna Gudur (Ericsson, India), Satheesh Kumar Perepu (Ericsson, India)