InterSpeech 2021

INTERSPEECH is the world’s largest and most comprehensive conference on the science and technology of spoken language processing. INTERSPEECH conferences emphasize interdisciplinary approaches addressing all aspects of speech science and technology, ranging from basic theories to advanced applications.

The theme of INTERSPEECH 2021, held in Brno, Czechia, was "Speech everywhere". Speech is becoming an indispensable part of all AI systems and is no longer considered an isolated block. We are seeing the emergence of larger systems that treat speech, vision, language, interfaces, and external knowledge in an integrated way, and that learn multi-modal embeddings or otherwise jointly optimize performance. "Speech everywhere" also requires speech engineering to become more aware of the principles of human speech communication processes, and we therefore specifically encouraged contributions in human speech processing.

In addition to regular oral and poster sessions, INTERSPEECH 2021 featured plenary talks by internationally renowned experts, tutorials, special sessions and challenges, show & tell sessions, and exhibits. A number of satellite events took place around INTERSPEECH 2021.

Website: www.interspeech2021.org, YouTube

Keynotes

Number of Recordings: 4

Survey talks

Number of Recordings: 4

Acoustic event detection and acoustic scene classification

Number of Recordings: 5

Applications in transcription, education and learning

Number of Recordings: 8

ASR Technologies and systems

Number of Recordings: 1

Assessment of pathological speech and language I

Number of Recordings: 4

Assessment of pathological speech and language II

Number of Recordings: 13

Automatic Speech Recognition in Air Traffic Management

Number of Recordings: 4

Communication and interaction, multimodality

Number of Recordings: 8

ConferencingSpeech 2021 challenge: Far-field Multi-Channel Speech Enhancement for Video Conferencing

Number of Recordings: 5

Cross/multi-lingual and code-switched ASR

Number of Recordings: 7

Disordered speech

Number of Recordings: 3

Diverse modes of speech acquisition and processing

Number of Recordings: 10

Embedding and Network Architecture for Speaker Recognition

Number of Recordings: 3

Emotion and Sentiment Analysis I

Number of Recordings: 2

Emotion and Sentiment Analysis II

Number of Recordings: 9

Emotion and Sentiment Analysis III

Number of Recordings: 4

Feature, Embedding and Neural Architecture for Speaker Recognition

Number of Recordings: 8

Graph and End-to-End Learning for Speaker Recognition

Number of Recordings: 1

Health and Affect I

Number of Recordings: 3

Health and Affect II

Number of Recordings: 9

INTERSPEECH 2021 Acoustic Echo Cancellation Challenge

Number of Recordings: 3

INTERSPEECH 2021 Deep Noise Suppression Challenge

Number of Recordings: 2

Keyword search and spoken language processing

Number of Recordings: 3

Language and Accent Recognition

Number of Recordings: 3

Language and Lexical Modeling for ASR

Number of Recordings: 8

Language Modeling and Text-based Innovations for ASR

Number of Recordings: 3

Linguistic Components in end-to-end ASR

Number of Recordings: 5

Low-resource speech recognition

Number of Recordings: 7

Miscellaneous topics in ASR

Number of Recordings: 3

Multi- and cross-lingual ASR, other topics in ASR

Number of Recordings: 8

Multi-channel speech enhancement and hearing aids

Number of Recordings: 9

Multimodal systems

Number of Recordings: 10

Neural Network Training Methods and Architectures for ASR

Number of Recordings: 4

Neural network training methods for ASR

Number of Recordings: 9

Non-Autoregressive Sequential Modeling for Speech Processing

Number of Recordings: 7

Non-native speech

Number of Recordings: 5

Novel neural network architectures for ASR

Number of Recordings: 8

OpenASR20 and Low Resource ASR Development

Number of Recordings: 3

Oriental Language Recognition

Number of Recordings: 3

Phonation and voicing

Number of Recordings: 4

Phonetics I

Number of Recordings: 1

Phonetics II

Number of Recordings: 11

Privacy-preserving Machine Learning for Audio & Speech Processing

Number of Recordings: 9

Prosodic features and structure

Number of Recordings: 8

Resource-constrained ASR

Number of Recordings: 8

Robust and Far-field ASR

Number of Recordings: 3

Robust Speaker Recognition

Number of Recordings: 8

SdSV Challenge 2021: Analysis and Exploration of New Ideas on Short-Duration Speaker Verification

Number of Recordings: 2

Search/decoding techniques and confidence measures for ASR

Number of Recordings: 6

Self-supervision and semi-supervision for neural ASR training

Number of Recordings: 5

Show and Tell 1

Number of Recordings: 5

Show and Tell 2

Number of Recordings: 5

Show and Tell 3

Number of Recordings: 7

Show and Tell 4

Number of Recordings: 7

Single-channel speech enhancement

Number of Recordings: 7

Source Separation I

Number of Recordings: 2

Source Separation II

Number of Recordings: 10

Source Separation III

Number of Recordings: 3

Source separation, dereverberation and echo cancellation

Number of Recordings: 3

Speaker Diarization I

Number of Recordings: 3

Speaker Diarization II

Number of Recordings: 9

Speaker Recognition: Applications

Number of Recordings: 9

Speaker, Language, and Privacy

Number of Recordings: 3

Speech and audio analysis

Number of Recordings: 4

Speech coding and privacy

Number of Recordings: 9

Speech enhancement and coding

Number of Recordings: 2

Speech enhancement and intelligibility

Number of Recordings: 12

Speech Localization, Enhancement, and Quality Assessment

Number of Recordings: 4

Speech perception I

Number of Recordings: 2

Speech perception II

Number of Recordings: 9

Speech production I

Number of Recordings: 4

Speech production II

Number of Recordings: 6

Speech Recognition of Atypical Speech

Number of Recordings: 11

Speech signal analysis and representation I

Number of Recordings: 12

Speech signal analysis and representation II

Number of Recordings: 4

Speech Synthesis: Linguistic processing, paradigms and other topics

Number of Recordings: 8

Speech Synthesis: Neural Waveform Generation

Number of Recordings: 6

Speech Synthesis: Other topics I

Number of Recordings: 4

Speech Synthesis: Prosody Modeling I

Number of Recordings: 6

Speech Synthesis: Prosody Modeling II

Number of Recordings: 3

Speech Synthesis: Singing, Multimodal, Crosslingual Synthesis

Number of Recordings: 8

Speech Synthesis: Speaking Style and Emotion

Number of Recordings: 7

Speech Synthesis: tools, data, evaluation

Number of Recordings: 8

Speech Synthesis: Toward End-to-End Synthesis I

Number of Recordings: 7

Speech Synthesis: Toward End-to-End Synthesis II

Number of Recordings: 8

Speech type classification and diagnosis

Number of Recordings: 8

Spoken Dialogue Systems I

Number of Recordings: 2

Spoken Dialogue Systems II

Number of Recordings: 5

Spoken Language Processing I

Number of Recordings: 7

Spoken Language Processing II

Number of Recordings: 2

Spoken Language Understanding I

Number of Recordings: 8

Spoken Language Understanding II

Number of Recordings: 3

Spoken machine translation

Number of Recordings: 12

Spoken Term Detection & Voice Search

Number of Recordings: 9

Streaming for ASR/RNN Transducers

Number of Recordings: 7

Target speaker detection, localization and separation

Number of Recordings: 5

The ADReSSo Challenge: Detecting cognitive decline using speech only

Number of Recordings: 7

The First DiCOVA Challenge: Diagnosis of COVid-19 using Acoustics

Number of Recordings: 6

The INTERSPEECH 2021 Computational Paralinguistics Challenge (ComParE) - COVID-19 Cough, COVID-19 Speech, Escalation & Primates

Number of Recordings: 8

Tools, corpora and resources

Number of Recordings: 11

Topics in ASR: Adaptation, transfer learning, children's speech, and low-resource settings

Number of Recordings: 9

Topics in ASR: Robustness, feature extraction, and far-field ASR

Number of Recordings: 8

Tutorials

Number of Recordings: 8

Voice activity detection

Number of Recordings: 5

Voice activity detection and keyword spotting

Number of Recordings: 10

Voice and voicing

Number of Recordings: 6

Voice Anti-Spoofing and Countermeasure

Number of Recordings: 11

Voice Conversion and Adaptation I

Number of Recordings: 7

Voice Conversion and Adaptation II

Number of Recordings: 4

Voice quality characterization for clinical voice assessment: Voice production, acoustics, and auditory perception

Number of Recordings: 4

Opening

Number of Recordings: 1

Closing

Number of Recordings: 3