Interspeech 2021

Speech Synthesis: tools, data, evaluation

Spectral and Latent Speech Representation Distortion for TTS Evaluation
(3 minutes introduction)

Thananchai Kongthaworn (Chulalongkorn University, Thailand), Burin Naowarat (Chulalongkorn University, Thailand), Ekapol Chuangsuwanich (Chulalongkorn University, Thailand)

RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis
(3 minutes introduction)

Rohola Zandie (University of Denver, USA), Mohammad H. Mahoor (University of Denver, USA), Julia Madsen (DreamFace Technologies, USA), Eshrat S. Emamian (DreamFace Technologies, USA)

Comparing Speech Enhancement Techniques for Voice Adaptation-Based Speech Synthesis
(3 minutes introduction)

Nicholas Eng (University of Auckland, New Zealand), C.T. Justine Hui (University of Auckland, New Zealand), Yusuke Hioka (University of Auckland, New Zealand), Catherine I. Watson (University of Auckland, New Zealand)

Perception of Social Speaker Characteristics in Synthetic Speech
(3 minutes introduction)

Sai Sirisha Rallabandi (Technische Universität Berlin, Germany), Abhinav Bharadwaj (Technische Universität Berlin, Germany), Babak Naderi (Technische Universität Berlin, Germany), Sebastian Möller (Technische Universität Berlin, Germany)

Hi-Fi Multi-Speaker English TTS Dataset
(3 minutes introduction)

Evelina Bakhturina (NVIDIA, USA), Vitaly Lavrukhin (NVIDIA, USA), Boris Ginsburg (NVIDIA, USA), Yang Zhang (NVIDIA, USA)

Utilizing Self-supervised Representations for MOS Prediction
(3 minutes introduction)

Wei-Cheng Tseng (National Taiwan University, Taiwan), Chien-yu Huang (National Taiwan University, Taiwan), Wei-Tsung Kao (National Taiwan University, Taiwan), Yist Y. Lin (National Taiwan University, Taiwan), Hung-yi Lee (National Taiwan University, Taiwan)

Utilizing Self-supervised Representations for MOS Prediction
(longer introduction)

Wei-Cheng Tseng (National Taiwan University, Taiwan), Chien-yu Huang (National Taiwan University, Taiwan), Wei-Tsung Kao (National Taiwan University, Taiwan), Yist Y. Lin (National Taiwan University, Taiwan), Hung-yi Lee (National Taiwan University, Taiwan)

KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset
(3 minutes introduction)

Saida Mussakhojayeva (Nazarbayev University, Kazakhstan), Aigerim Janaliyeva (Nazarbayev University, Kazakhstan), Almas Mirzakhmetov (Nazarbayev University, Kazakhstan), Yerbolat Khassanov (Nazarbayev University, Kazakhstan), Huseyin Atakan Varol (Nazarbayev University, Kazakhstan)