|Abbas Khosravani, Mohammad Mehdi Homayounpour, Dijana Petrovska-Delacrétaz, Gérard Chollet|
There are many factors affecting the variability of an i-vector extracted from a speech segment, such as the acoustic content, segment duration, handset type, and background noise. The state-of-the-art Probabilistic Linear Discriminant Analysis (PLDA) models all of these sources of undesirable variability within a single covariance matrix. Although techniques such as source normalization have been proposed to reduce the effect of different sources of variability as a preprocessing step for PLDA, speaker recognition performance still degrades under cross-source evaluation conditions. This study proposes a language-independent PLDA training algorithm that reduces the effect of language on speaker recognition performance. Accurate estimates of the speaker and channel subspaces, obtained from a multilingual training dataset and free of language variability, can help PLDA operate independently of language. When evaluated on the multilingual trials of the NIST 2008 Speaker Recognition Evaluation, the proposed method demonstrates relative improvements of up to 10% in equal error rate (EER) and 6.4% in minimum detection cost function (minDCF).
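To make the modeling assumption concrete, the sketch below implements a toy two-covariance PLDA verification score: each i-vector is modeled as a speaker factor plus a residual that lumps together all nuisance variability (channel, duration, language, noise) in a single covariance, and a trial is scored by the log-likelihood ratio of the same-speaker versus different-speaker hypotheses. This is a generic illustration of the PLDA scoring principle, not the paper's proposed algorithm; all dimensions, covariances, and vectors are illustrative assumptions.

```python
import numpy as np

# Toy two-covariance PLDA sketch (illustrative assumptions, not the
# paper's method). Model: i-vector x = mu + y + e, with speaker factor
# y ~ N(0, Sb) (between-speaker covariance) and residual e ~ N(0, Sw)
# (within-speaker covariance absorbing channel, duration, language, ...).

def log_gauss(x, cov):
    """Log-density of N(0, cov) evaluated at x."""
    d = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + x @ np.linalg.solve(cov, x))

def plda_llr(x1, x2, Sb, Sw):
    """Log-likelihood ratio: same-speaker vs. different-speaker hypothesis."""
    tot = Sb + Sw                                   # marginal covariance of one i-vector
    joint = np.block([[tot, Sb], [Sb, tot]])        # joint covariance if speakers match
    x = np.concatenate([x1, x2])
    return log_gauss(x, joint) - log_gauss(x1, tot) - log_gauss(x2, tot)

# Illustrative covariances: strong speaker variability, moderate residual.
Sb = 2.0 * np.eye(2)
Sw = 0.5 * np.eye(2)

same_score = plda_llr(np.array([1.6, -0.8]), np.array([1.3, -0.9]), Sb, Sw)
diff_score = plda_llr(np.array([1.6, -0.8]), np.array([-1.2, 1.0]), Sb, Sw)
```

A pair of nearby i-vectors (plausibly the same speaker) receives a higher LLR than a distant pair; enlarging `Sw`, as uncompensated language mismatch effectively does, shrinks this separation, which is the degradation the paper targets.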