Towards automatic speech recognition for people with atypical speech

Heidi Christensen (University of Sheffield)

Abstract

In the last decade, speech technologies for typical speech have matured, enabling a multitude of services and technologies, including voice-enabled conversational interfaces and dictation, and successfully underpinning the use of state-of-the-art NLP techniques. This ever more pervasive offering often allows for a far more convenient and natural way of interacting with machines and systems. However, it also represents an ever-growing gap for people with atypical (dysarthric) voices: people with even mild-to-moderate speech disorders cannot achieve satisfactory performance with current automatic speech recognition (ASR) systems, and hence they are falling further and further behind in their ability to use modern devices and interfaces. This talk will present the major challenges in porting mainstream ASR methodologies to atypical speech, discuss recent advances, and offer thoughts on where research effort should focus to have real impact in this community of potential users. Being able to speak a query or dictate an email offers a lot of convenience to most of us, but for this group of people it can have significant implications for their quality of life and their ability to take part fully in society.

Bio

Dr Heidi Christensen is a Senior Lecturer in Computer Science at the University of Sheffield, United Kingdom. Her research focuses on the application of AI-based voice technologies to healthcare, in two main areas: (i) the automatic recognition of atypical speech and (ii) the detection and monitoring of people’s physical and mental health, including verbal and non-verbal traits for expressions of emotion, anxiety, depression and neurodegenerative conditions, e.g. in therapeutic or diagnostic settings.

InterSpeech 2021

