InterSpeech 2021

Remote smartphone-based speech collection: acceptance and barriers in individuals with major depressive disorder
(3 minutes introduction)

Judith Dineley (Universität Augsburg, Germany), Grace Lavelle (King’s College London, UK), Daniel Leightley (King’s College London, UK), Faith Matcham (King’s College London, UK), Sara Siddi (CIBERSAM, Spain), Maria Teresa Peñarrubia-María (IDIAP Jordi Gol, Spain), Katie M. White (King’s College London, UK), Alina Ivan (King’s College London, UK), Carolin Oetzmann (King’s College London, UK), Sara Simblett (King’s College London, UK), Erin Dawe-Lane (King’s College London, UK), Stuart Bruce (King’s College London, UK), Daniel Stahl (King’s College London, UK), Yatharth Ranjan (King’s College London, UK), Zulqarnain Rashid (King’s College London, UK), Pauline Conde (King’s College London, UK), Amos A. Folarin (King’s College London, UK), Josep Maria Haro (CIBERSAM, Spain), Til Wykes (King’s College London, UK), Richard J.B. Dobson (King’s College London, UK), Vaibhav A. Narayan (Janssen, USA), Matthew Hotopf (King’s College London, UK), Björn W. Schuller (Universität Augsburg, Germany), Nicholas Cummins (Universität Augsburg, Germany), The RADAR-CNS Consortium
The ease of in-the-wild speech recording using smartphones has sparked considerable interest in the combined application of speech, remote measurement technology (RMT) and advanced analytics as a research and healthcare tool. For this to be realised, the acceptability of remote speech collection to the user must be established, in addition to feasibility from an analytical perspective. To understand the acceptance of smartphone-based speech recording, and the facilitators of and barriers to it, we invited 384 individuals with major depressive disorder (MDD) from the Remote Assessment of Disease and Relapse – Central Nervous System (RADAR-CNS) research programme in Spain and the UK to complete a survey on their experiences recording their speech. In this analysis, we demonstrate that study participants were more comfortable completing a scripted speech task than a free speech task. For both speech tasks, we found depression severity and country to be significant predictors of comfort. Not seeing smartphone notifications of the scheduled speech tasks, low mood and forgetfulness were the most commonly reported obstacles to providing speech recordings.