Complementary Combination in i-Vector Level for Language Recognition

Presented by:

Zhi-Yi Li

Author(s):

Zhi-Yi Li, Wei-Qiang Zhang, Liang He and Jia Liu

Recently, i-vector based technology has provided good performance in language recognition (LRE). From the viewpoint of information theory, i-vectors derived from different acoustic features can contain more useful and complementary language information. In this paper, we propose an effective complementary combination method for i-vectors derived from two different, complementary acoustic features: the popular short-term spectral shifted delta cepstral (SDC) feature and the new spectro-temporal time-frequency cepstrum (TFC) feature. To reduce the high dimension of the combined i-vectors and to remove redundant information, principal component analysis (PCA) and linear discriminant analysis (LDA) are each applied and their performance is evaluated. Moreover, two popular classifiers, cosine distance scoring (CDS) and the support vector machine (SVM), are applied to model the combined low-dimensional i-vectors. The experiments are performed on the NIST LRE 2009 dataset, and the results show that the proposed combination method provides better performance at a lower dimension. For the best system, the equal error rate (EER) is reduced by 1% compared with the corresponding baseline systems for the 30 s duration, and by 2.3% for the 10 s and 3 s durations.
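As a rough illustration of the combination idea in the abstract, the sketch below concatenates an SDC-derived and a TFC-derived i-vector and scores the result with cosine distance scoring against per-language mean vectors. It is a minimal toy under stated assumptions, not the authors' implementation: the function names, the per-stream length normalization before concatenation, the use of language means as CDS models, and the tiny hand-made vectors are all hypothetical, and the PCA/LDA dimension-reduction step is omitted for brevity.

```python
import math


def length_normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def combine(ivec_sdc, ivec_tfc):
    """Concatenate two i-vectors after normalizing each (normalization is an assumption)."""
    return length_normalize(ivec_sdc) + length_normalize(ivec_tfc)


def cosine_score(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def language_means(train):
    """Average the combined training vectors of each language."""
    means = {}
    for lang, vecs in train.items():
        dim = len(vecs[0])
        means[lang] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return means


def classify(test_vec, means):
    """Return the best-scoring language and all cosine scores."""
    scores = {lang: cosine_score(test_vec, m) for lang, m in means.items()}
    return max(scores, key=scores.get), scores


# Toy data: 2-dimensional "SDC" and 3-dimensional "TFC" i-vectors (made up).
train = {
    "lang_a": [combine([1.0, 0.1], [0.9, 0.0, 0.1]),
               combine([0.9, 0.2], [1.0, 0.1, 0.0])],
    "lang_b": [combine([0.1, 1.0], [0.0, 0.2, 1.0]),
               combine([0.2, 0.9], [0.1, 0.0, 0.9])],
}
means = language_means(train)
best, scores = classify(combine([0.8, 0.3], [0.9, 0.1, 0.2]), means)
print(best, scores)
```

In a fuller pipeline, a PCA or LDA projection would be applied to the combined vectors before scoring, and an SVM could be used in place of the cosine scorer.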

Odyssey 2012

The Speaker and Language Recognition Workshop

Related Recordings

Speaker Vectors from Subspace Gaussian Mixture Model as Complementary Features for Language Identification

Bhattacharyya-based GMM-SVM System with Adaptive Relevance Factor for Pair Language Recognition