Odyssey 2016

The Speaker and Language Recognition Workshop

Age-Related Voice Disguise and its Impact on Speaker Verification Accuracy

Rosa González Hautamäki, Md Sahidullah, Tomi Kinnunen, Ville Hautamäki
This study focuses on the impact of age-related intentional voice modification, or age disguise, on the performance of automatic speaker verification (ASV) systems. The data collected for this study include 60 native Finnish speakers (29 males, 31 females) aged between 18 and 73 years. The corpus consists of two sessions of read speech per speaker. Our experiments demonstrate the vulnerability of modern ASV systems when a person attempts to conceal his or her identity by modifying the voice to sound like an old or young person. For our i-vector PLDA system, the equal error rate (EER) for male speakers increased 7-fold for attempted old voice and 11-fold for attempted young voice. A similar degradation in performance is observed for female speakers, with a 5-fold increase in EER for old voice disguise and a 6-fold increase for young voice disguise. We further analyze the factors affecting the performance of ASV systems on the studied speech data. In our experiments, male speakers were more successful in disguising their voices. The effect on fundamental frequency (F0) was also studied. The mean F0 distributions shifted towards higher frequencies when speakers attempted a young voice, which is consistent with the perception that younger speakers' F0 values tend to be higher than those of older speakers.
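For readers unfamiliar with the metric, the following is a minimal sketch of how an EER and a fold increase of the kind reported above could be computed from verification scores. It is not the authors' evaluation code: the function names `equal_error_rate` and `fold_increase`, the simple threshold sweep, and the toy scores are all assumptions made for illustration, and the numbers bear no relation to the results of this study.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is at or above the threshold.
    The EER is taken at the threshold where the miss rate and the
    false alarm rate are closest, as the average of the two rates.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss rate: fraction of target (same-speaker) trials rejected.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm rate: fraction of nontarget trials accepted.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if gap < best_gap:
            best_gap, eer = gap, (miss + fa) / 2
    return eer


def fold_increase(baseline_eer, disguised_eer):
    """Ratio of the disguised-condition EER to the baseline EER."""
    if baseline_eer <= 0:
        raise ValueError("baseline EER must be positive to form a ratio")
    return disguised_eer / baseline_eer


# Toy scores (hypothetical, not data from this study).
nontarget = [0.0, 1.0, 2.5, 1.5]
target_natural = [2.0, 3.0, 4.0, 5.0]
# Disguise is assumed here to lower some same-speaker scores.
target_disguised = [0.5, 1.2, 3.0, 5.0]

eer_natural = equal_error_rate(target_natural, nontarget)
eer_disguised = equal_error_rate(target_disguised, nontarget)
print(eer_natural, eer_disguised, fold_increase(eer_natural, eer_disguised))
```

The threshold sweep is quadratic in the number of trials and only approximates the crossing point on small score sets; an evaluation at scale would more likely sort the scores once and interpolate on the resulting error curve.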