Handling acoustic variation in dysarthric speech recognition systems through model combination (Oral presentation)

Enno Hermann (Idiap Research Institute, Switzerland), Mathew Magimai-Doss (Idiap Research Institute, Switzerland)

Developing automatic speech recognition (ASR) systems that recognise dysarthric speech as well as control speech from unimpaired speakers remains challenging. Moreover, including more of the highly variable dysarthric speech during training can degrade performance on control speakers, which is undesirable when developing speech recognisers for a wider audience. In this work, we analyse how the acoustic variability of dysarthric speech affects ASR systems and propose combining multiple acoustic models, each trained on a different subset of speakers, to mitigate this effect. This approach shows improvements for both dysarthric and control speakers on the Torgo and UA-Speech corpora.
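
The abstract does not say how the acoustic models are combined. As a minimal sketch of one common option, assuming each model outputs per-frame posterior probabilities over the same set of classes, the posteriors could be averaged with per-model weights before decoding. The function name, the weights, and the toy numbers below are illustrative assumptions, not the authors' method.

```python
def combine_posteriors(model_posteriors, weights=None):
    """Weighted average of per-frame class posteriors from several models.

    model_posteriors: list of models; each model is a list of frames;
    each frame is a list of class probabilities. All models are assumed
    to share the same number of frames and classes.
    """
    n_models = len(model_posteriors)
    if weights is None:
        # Default assumption: every model contributes equally.
        weights = [1.0 / n_models] * n_models
    if len(weights) != n_models:
        raise ValueError("one weight per model is required")

    # Normalise the weights so each combined frame still sums to 1.
    total = sum(weights)
    weights = [w / total for w in weights]

    combined = []
    for t in range(len(model_posteriors[0])):
        n_classes = len(model_posteriors[0][t])
        frame = [
            sum(w * model[t][k] for w, model in zip(weights, model_posteriors))
            for k in range(n_classes)
        ]
        combined.append(frame)
    return combined


# Toy example: two hypothetical models, two frames, three classes.
control_model = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]      # e.g. trained on control speakers
dysarthric_model = [[0.5, 0.4, 0.1], [0.3, 0.3, 0.4]]   # e.g. trained on dysarthric speakers

combined = combine_posteriors([control_model, dysarthric_model])

# Most likely class per frame after combination.
best_classes = [max(range(len(f)), key=f.__getitem__) for f in combined]
```

With equal weights, a frame on which the two models disagree is resolved by whichever model is more confident. Unequal weights would let a system favour the model that better matches the expected speaker group.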

Related Recordings

Investigating the Utility of Multimodal Conversational Technology and Audiovisual Analytic Measures for the Assessment and Monitoring of Amyotrophic Lateral Sclerosis at Scale
(Oral presentation)

Michael Neumann, Oliver Roesler, Jackson Liscombe, Hardik Kothare, David Suendermann-Oeft, David Pautler, Indu Navar, Aria Anvar, Jochen Kumm, Raquel Norel, Ernest Fraenkel, Alexander V. Sherman, James D. Berry, Gary L. Pattee, Jun Wang, Jordan R. Green, Vikram Ramanarayanan

Adversarial Data Augmentation for Disordered Speech Recognition
(Oral presentation)

Zengrui Jin, Mengzhe Geng, Xurong Xie, Jianwei Yu, Shansong Liu, Xunying Liu, Helen Meng

InterSpeech 2021
