Interspeech 2021

Investigating the Utility of Multimodal Conversational Technology and Audiovisual Analytic Measures for the Assessment and Monitoring of Amyotrophic Lateral Sclerosis at Scale
(Oral presentation)

Michael Neumann (Modality.AI, USA), Oliver Roesler (Modality.AI, USA), Jackson Liscombe (Modality.AI, USA), Hardik Kothare (Modality.AI, USA), David Suendermann-Oeft (Modality.AI, USA), David Pautler (Modality.AI, USA), Indu Navar (Peter Cohen Foundation, USA), Aria Anvar (Peter Cohen Foundation, USA), Jochen Kumm (Pr3vent, USA), Raquel Norel (IBM, USA), Ernest Fraenkel (MIT, USA), Alexander V. Sherman (MGH Institute of Health Professions, USA), James D. Berry (MGH Institute of Health Professions, USA), Gary L. Pattee (University of Nebraska, USA), Jun Wang (University of Texas at Austin, USA), Jordan R. Green (MGH Institute of Health Professions, USA), Vikram Ramanarayanan (Modality.AI, USA)
We propose a cloud-based multimodal dialog platform for the remote assessment and monitoring of Amyotrophic Lateral Sclerosis (ALS) at scale. This paper presents our vision, the technology setup, and an initial investigation of the efficacy of the various acoustic and visual speech metrics automatically extracted by the platform. Eighty-two healthy controls and 54 people with ALS (pALS) interacted with the platform, completing a battery of speaking tasks designed to probe the acoustic, articulatory, phonatory, and respiratory aspects of their speech. We find that multiple acoustic (rate, duration, voicing) and visual (higher-order statistics of the jaw and lip) speech metrics show statistically significant differences between controls, bulbar-symptomatic, and bulbar pre-symptomatic patients. We report the sensitivity and specificity of these metrics using five-fold cross-validation. We further conducted a LASSO-LARS regression analysis to uncover the relative contributions of various acoustic and visual features in predicting the severity of patients' ALS (as measured by their self-reported ALSFRS-R scores). Our results provide encouraging evidence of the utility of automatically extracted audiovisual analytics for scalable remote patient assessment and monitoring in ALS.
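
For readers who want to reproduce this style of analysis, the sketch below illustrates the two evaluation steps named in the abstract: sensitivity and specificity from stratified five-fold cross-validation, and a LASSO-LARS regression relating speech features to ALSFRS-R scores. It uses scikit-learn and synthetic placeholder data matching the reported cohort sizes (82 controls, 54 pALS); the feature matrix, the logistic-regression classifier, and the alpha value are illustrative assumptions, not the authors' actual pipeline.

    # Minimal sketch, assuming scikit-learn; data and features are synthetic placeholders.
    import numpy as np
    from sklearn.linear_model import LassoLars, LogisticRegression
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import StratifiedKFold, cross_val_predict
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    n_controls, n_pals, n_feats = 82, 54, 10          # cohort sizes from the paper
    X = rng.normal(size=(n_controls + n_pals, n_feats))  # placeholder acoustic/visual metrics
    y = np.array([0] * n_controls + [1] * n_pals)        # 0 = control, 1 = pALS

    # Sensitivity/specificity via stratified five-fold cross-validation.
    # A logistic-regression classifier stands in for whatever decision rule was used.
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    y_pred = cross_val_predict(clf, X, y, cv=cv)
    tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
    print(f"sensitivity = {tp / (tp + fn):.2f}, specificity = {tn / (tn + fp):.2f}")

    # LASSO-LARS: sparse regression of ALSFRS-R scores on the pALS feature rows.
    alsfrs_r = rng.uniform(0, 48, size=n_pals)        # placeholder scores (scale 0-48)
    X_pals = StandardScaler().fit_transform(X[n_controls:])
    lasso = LassoLars(alpha=0.01).fit(X_pals, alsfrs_r)
    for i, coef in enumerate(lasso.coef_):
        if coef != 0.0:                               # LARS path zeroes out weak features
            print(f"feature_{i}: weight {coef:+.3f}")

The nonzero LASSO-LARS coefficients play the role of the "relative contributions" mentioned in the abstract: features driven to zero by the L1 penalty are excluded, and the surviving weights rank the remaining audiovisual metrics by their contribution to the predicted severity score.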