|Abbas Khosravani, Mohammad Mehdi Homayounpour, Dijana Petrovska-Delacrétaz, Gérard Chollet|
There are many factors affecting the variability of an i-vector extracted from a speech segment, such as the acoustic content, segment duration, handset type, and background noise. The state-of-the-art Probabilistic Linear Discriminant Analysis (PLDA) models all of these sources of undesirable variability within a single covariance matrix. Although techniques such as source normalization have been proposed to reduce the effect of different sources of variability as a preprocessing step for PLDA, speaker recognition performance still degrades under cross-source evaluation conditions. This study proposes a language-independent PLDA training algorithm that reduces the effect of language on speaker recognition performance. Accurate estimates of the speaker and channel subspaces, obtained from a multilingual training dataset and free of language variability, can help PLDA operate independently of language. When evaluated on the multilingual trials of the NIST 2008 Speaker Recognition Evaluation, the proposed method demonstrates relative improvements of up to 10% in equal error rate (EER) and 6.4% in minimum detection cost function (minDCF).
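To make the modeling assumption concrete, the sketch below implements a toy two-covariance PLDA verification score: each i-vector is modeled as a speaker factor plus a residual that lumps together all nuisance variability (channel, duration, language, noise) in a single covariance, and a trial is scored by the log-likelihood ratio of the same-speaker versus different-speaker hypotheses. This is a generic illustration of the PLDA scoring principle, not the paper's proposed algorithm; all dimensions, covariances, and vectors are illustrative assumptions.

```python
import numpy as np

# Toy two-covariance PLDA sketch (illustrative assumptions, not the
# paper's method). Model: i-vector x = mu + y + e, with speaker factor
# y ~ N(0, Sb) (between-speaker covariance) and residual e ~ N(0, Sw)
# (within-speaker covariance absorbing channel, duration, language, ...).

def log_gauss(x, cov):
    """Log-density of N(0, cov) evaluated at x."""
    d = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + x @ np.linalg.solve(cov, x))

def plda_llr(x1, x2, Sb, Sw):
    """Log-likelihood ratio: same-speaker vs. different-speaker hypothesis."""
    tot = Sb + Sw                                   # marginal covariance of one i-vector
    joint = np.block([[tot, Sb], [Sb, tot]])        # joint covariance if speakers match
    x = np.concatenate([x1, x2])
    return log_gauss(x, joint) - log_gauss(x1, tot) - log_gauss(x2, tot)

# Illustrative covariances: strong speaker variability, moderate residual.
Sb = 2.0 * np.eye(2)
Sw = 0.5 * np.eye(2)

same_score = plda_llr(np.array([1.6, -0.8]), np.array([1.3, -0.9]), Sb, Sw)
diff_score = plda_llr(np.array([1.6, -0.8]), np.array([-1.2, 1.0]), Sb, Sw)
```

A pair of nearby i-vectors (plausibly the same speaker) receives a higher LLR than a distant pair; enlarging `Sw`, as uncompensated language mismatch effectively does, shrinks this separation, which is the degradation the paper targets.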