Age-Related Voice Disguise and its Impact on Speaker Verification Accuracy

Rosa González Hautamäki, Md Sahidullah, Tomi Kinnunen, Ville Hautamäki

This study focuses on the impact of age-related intentional voice modification, or age disguise, on the performance of automatic speaker verification (ASV) systems. The data collected for this study include 60 native Finnish speakers (29 males, 31 females) aged between 18 and 73 years. The corpus consists of two sessions of read speech per speaker. Our experiments demonstrate the vulnerability of modern ASV systems when a person attempts to conceal his or her identity by modifying the voice to sound like an old or young person. For our i-vector PLDA system, the equal error rate (EER) for male speakers increased 7-fold for the old-voice attempt and 11-fold for the young-voice attempt. A similar degradation in performance is observed for female speakers, with a 5-fold increase in EER for old voice disguise and a 6-fold increase for young voice disguise. We further analyze the factors affecting the performance of ASV systems on the studied speech data. In our experiments, male speakers were found to be more successful in disguising their voices. The effect on fundamental frequency (F0) was also studied. The mean F0 distributions showed a shift towards higher frequencies when speakers attempted a young voice, which is consistent with the perception that younger speakers' F0 values tend to be higher than those of older speakers.
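The EER figures quoted above can be made concrete with a small sketch. It assumes verification scores where higher means "more likely the same speaker" and sweeps a decision threshold to find the point where the false acceptance and false rejection rates are closest. The function names `compute_eer` and `fold_increase` and all the scores are hypothetical illustrations, not the authors' evaluation code or data.

```python
def compute_eer(target_scores, nontarget_scores):
    """Approximate equal error rate (EER) as a fraction in [0, 1].

    Assumption: a trial is accepted when its score is >= the threshold.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every observed score, plus one above all of
    # them so that "reject every trial" is also considered.
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    thresholds.append(thresholds[-1] + 1.0)

    best_gap, eer = None, None
    for t in thresholds:
        # False rejection rate: genuine trials scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: impostor trials scoring at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


def fold_increase(baseline_eer, disguised_eer):
    """Ratio of the EER under disguise to the EER with natural voice."""
    if baseline_eer <= 0:
        raise ValueError("baseline EER must be positive")
    return disguised_eer / baseline_eer


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    genuine = [0.1, 0.6, 0.8, 0.9]
    impostor = [0.2, 0.3, 0.4, 0.7]
    print(compute_eer(genuine, impostor))
```

Under this reading, an "N-fold increase" is the ratio of the EER measured on disguised-voice trials to the EER measured on natural-voice trials from the same speakers. Practical evaluations typically interpolate between thresholds rather than picking the nearest one, so this sketch is only an approximation on small score sets.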

Odyssey 2016

The Speaker and Language Recognition Workshop
