InterSpeech 2021

Influence of the Interviewer on the Automatic Assessment of Alzheimer's Disease in the Context of the ADReSSo Challenge
(3-minute introduction)

P.A. Pérez-Toro (FAU Erlangen-Nürnberg, Germany), S.P. Bayerl (TH Nürnberg, Germany), T. Arias-Vergara (FAU Erlangen-Nürnberg, Germany), J.C. Vásquez-Correa (FAU Erlangen-Nürnberg, Germany), P. Klumpp (FAU Erlangen-Nürnberg, Germany), M. Schuster (LMU München, Germany), E. Nöth (FAU Erlangen-Nürnberg, Germany), J.R. Orozco-Arroyave (FAU Erlangen-Nürnberg, Germany), K. Riedhammer (TH Nürnberg, Germany)
Alzheimer’s Disease (AD) results from the progressive loss of neurons in the hippocampus, which affects the capability to produce coherent language. It affects lexical, grammatical, and semantic processes as well as speech fluency. This paper considers the analysis of speech and language for the assessment of AD in the context of the Alzheimer’s Dementia Recognition through Spontaneous Speech (ADReSSo) 2021 challenge. We propose to extract acoustic features such as X-vectors, prosody, and emotional embeddings, as well as linguistic features such as perplexity and word embeddings. The data consist of speech recordings from AD patients and healthy controls. The transcriptions are obtained using a commercial automatic speech recognition system. We outperform the baseline results on the test set for both classification and Mini-Mental State Examination (MMSE) score prediction, achieving a classification accuracy of 80% and an RMSE of 4.56 in the regression. In cross-validation on the training set, we obtain a classification accuracy of 85% using the combined speech of the interviewer and the participant. Using interviewer speech only, we still obtain an accuracy of 78%. Thus, we provide strong evidence for the influence of the interviewer on classification results.
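
For readers unfamiliar with the two evaluation measures named in the abstract, the following is a minimal sketch of how classification accuracy and root-mean-square error (RMSE) are commonly computed. It is not the authors' evaluation code. The function names and all toy labels and MMSE scores below are hypothetical and chosen only for illustration; they are not data from the challenge.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the reference labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy data (NOT from the ADReSSo corpus):
# "AD" = Alzheimer's patient, "HC" = healthy control.
labels_true = ["AD", "HC", "AD", "HC", "AD"]
labels_pred = ["AD", "HC", "HC", "HC", "AD"]

# Hypothetical MMSE scores (the MMSE scale runs from 0 to 30).
mmse_true = [20.0, 29.0, 15.0, 28.0]
mmse_pred = [23.0, 27.0, 19.0, 28.0]

toy_accuracy = accuracy(labels_true, labels_pred)
toy_rmse = rmse(mmse_true, mmse_pred)
print(f"accuracy = {toy_accuracy:.2f}, RMSE = {toy_rmse:.2f}")
```

Higher accuracy and lower RMSE indicate better performance, which is the sense in which the abstract reports outperforming the baseline on both tasks.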