Multiclass Discriminative Training of i-vector Language Recognition

Alan McCree

The current state of the art in acoustic language recognition is an i-vector classifier followed by a discriminatively trained multiclass back-end. This paper presents a unified approach, where a Gaussian i-vector classifier is trained using Maximum Mutual Information (MMI) to directly optimize the multiclass calibration criterion, so that no separate back-end is needed. The system is extended to the open-set task by training an additional Gaussian model. Results on the NIST LRE11 standard evaluation task confirm that high performance is maintained with this new single-stage approach.
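As a rough illustration of the criterion described in the abstract, the sketch below scores i-vectors with one Gaussian per language and evaluates an MMI-style objective, taken here as the mean log posterior of the correct class. It is not the paper's implementation: the shared covariance, the fixed priors, and all function names are assumptions made for this example, and the gradient-based parameter update is left out.

```python
import numpy as np

def gaussian_log_likelihoods(x, means, cov):
    """Log-likelihood of each i-vector under each class Gaussian.

    Assumption for this sketch: all classes share one covariance matrix.
    x: (N, D) i-vectors, means: (K, D) class means, cov: (D, D).
    """
    precision = np.linalg.inv(cov)
    diff = x[:, None, :] - means[None, :, :]              # (N, K, D)
    mahal = np.einsum("nkd,de,nke->nk", diff, precision, diff)
    _, logdet = np.linalg.slogdet(cov)
    dim = x.shape[1]
    return -0.5 * (mahal + logdet + dim * np.log(2.0 * np.pi))

def log_posteriors(log_liks, log_priors):
    """Numerically stable log-softmax over classes."""
    joint = log_liks + log_priors
    peak = joint.max(axis=1, keepdims=True)
    log_norm = peak + np.log(np.exp(joint - peak).sum(axis=1, keepdims=True))
    return joint - log_norm

def mmi_objective(x, labels, means, cov, log_priors):
    """Mean log posterior of the true class (higher is better, at most 0)."""
    lp = log_posteriors(gaussian_log_likelihoods(x, means, cov), log_priors)
    return lp[np.arange(len(labels)), labels].mean()
```

Maximizing this objective over the Gaussian parameters is the same as minimizing multiclass cross-entropy, which is why a classifier trained this way could in principle produce calibrated scores without a separate back-end stage.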

Odyssey 2014

The Speaker and Language Recognition Workshop
