InterSpeech 2021

Measuring Voice Quality Parameters after Speaker Pseudonymization
(Oral presentation)

Rob J.J.H. van Son (Netherlands Cancer Institute, The Netherlands)
Collecting and sharing speech resources is important for progress in speech science and technology. Often, speech resources cannot be shared because of concerns over the privacy of the speakers, e.g., minors or people with medical conditions. Current technologies for pseudonymizing speech have only been tested on “standard” speech, where pseudonymization methods are evaluated on speaker identification risk, intelligibility, and naturalness. For many applications, the important characteristics are paralinguistic aspects of the speech, e.g., voice quality, emotion, or disease progression. Little information is available about the extent to which speaker pseudonymization methods preserve such paralinguistic information. The current study investigates how well voice quality parameters are preserved by an example speech pseudonymization application. Correlations prove to be high between original and pseudonymized recordings for seven acoustic parameters and a composite measure of dysphonia, the AVQI. Root mean square errors for these parameters were reasonably small. A linear mixed-effects model shows a link between the difference between source and target speaker and the size of the absolute difference in the AVQI. It is argued that new measures of quality are needed for pseudonymized non-standard speech before widespread application of pseudonymized speech can be considered in research and clinical practice.
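
The two agreement measures named in the abstract, correlation and root mean square error between original and pseudonymized recordings, can be illustrated with a minimal Python sketch. It assumes one value of a voice quality parameter per recording, paired across the two conditions. The function names and the numbers in the usage lines are hypothetical placeholders, not data or code from the study.

```python
import math


def pearson_r(original, pseudonymized):
    """Pearson correlation between paired per-recording measurements."""
    n = len(original)
    if n != len(pseudonymized) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_o = sum(original) / n
    mean_p = sum(pseudonymized) / n
    # Sums of products of deviations from the means
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(original, pseudonymized))
    s_oo = sum((o - mean_o) ** 2 for o in original)
    s_pp = sum((p - mean_p) ** 2 for p in pseudonymized)
    if s_oo == 0 or s_pp == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_op / math.sqrt(s_oo * s_pp)


def rmse(original, pseudonymized):
    """Root mean square error between paired per-recording measurements."""
    n = len(original)
    if n != len(pseudonymized) or n == 0:
        raise ValueError("need two equally long, non-empty sequences")
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(original, pseudonymized)) / n)


# Placeholder values only (NOT study data): one hypothetical parameter value
# per recording, before and after pseudonymization.
original_values = [2.1, 3.4, 4.0, 5.2, 6.8]
pseudonymized_values = [2.4, 3.1, 4.3, 5.0, 7.1]

r = pearson_r(original_values, pseudonymized_values)
err = rmse(original_values, pseudonymized_values)
print(f"r = {r:.3f}, RMSE = {err:.3f}")
```

A high correlation alone does not rule out a systematic offset between the two conditions, which is why the RMSE is worth reporting alongside it.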