Odyssey 2012

The Speaker and Language Recognition Workshop

A Hybrid Factor Analysis and Probabilistic PCA-based system for Dictionary Learning and Encoding for Robust Speaker Recognition

Srikanth Madikeri

Probabilistic Principal Component Analysis (PPCA)-based low-dimensional representations of speech utterances have been found useful for speaker recognition. Although the Factor Analysis (FA)-based total variability space model performs better, hyperparameter estimation in PPCA is computationally more efficient. In this work, a recent insight that views the FA-based approach as a combination of dictionary learning and encoding is used to apply the FA encoding procedure within the PPCA framework. By applying this alternate encoding technique to dictionaries learnt using PPCA, the proposed procedure matches the performance of the state-of-the-art FA-based i-vector approach. A 4x speed-up in hyperparameter estimation is obtained at the cost of a 0.51% deterioration in Equal Error Rate (EER) in the worst case. Compared to the conventional PPCA model, absolute improvements of 2.1% and 2.8% are observed on two telephone conditions of the NIST 2008 SRE database. Using Canonical Correlation Analysis, it is shown that the i-vectors extracted from the conventional FA model and the proposed approach are highly correlated.
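
As an illustration of the kind of comparison mentioned above, the following is a minimal sketch of computing canonical correlations between two row-aligned sets of vectors. It is not the procedure used in this work: the function name `canonical_correlations`, the QR-plus-SVD formulation, and the synthetic data standing in for i-vectors are all assumptions made for the example.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between row-aligned X (n x p) and Y (n x q).

    Hypothetical helper: assumes n > max(p, q) and full column rank.
    """
    # Center each set of vectors.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Synthetic stand-ins for two sets of utterance-level vectors (assumption).
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5))

# Y_related is a rotated, slightly noisy copy of X; Y_indep is unrelated.
rotation, _ = np.linalg.qr(rng.standard_normal((5, 5)))
Y_related = X @ rotation + 0.01 * rng.standard_normal((500, 5))
Y_indep = rng.standard_normal((500, 5))

rho_related = canonical_correlations(X, Y_related)
rho_indep = canonical_correlations(X, Y_indep)
print("related:    ", np.round(rho_related, 3))
print("independent:", np.round(rho_indep, 3))
```

In this sketch, canonical correlations close to 1 indicate that the two representations span nearly the same subspace up to a linear transform, while unrelated representations give values well below 1.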