On the Use of PLDA i-vector Scoring for Clustering Short Segments

Itay Salmun, Irit Opher, Itshak Lapidot

This paper extends previous work that used the mean shift algorithm to cluster speakers from i-vectors generated from short speech segments. We examine the effectiveness of probabilistic linear discriminant analysis (PLDA) scoring as the metric of the mean shift clustering algorithm for different numbers of speakers. The proposed method, combined with k-nearest neighbors (kNN) for bandwidth estimation, yields better and more robust results than cosine similarity with a fixed neighborhood bandwidth when clustering segments from a large number of speakers. For 30 speakers, the PLDA-based mean shift algorithm achieved an evaluation parameter of 72.1, compared to 65.9 for the cosine-based baseline system.
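The general idea of mean shift with a kNN-defined neighborhood can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the parameters `k`, `n_iter`, and `merge_threshold`, and the final mode-merging step are all assumptions made for illustration. PLDA scoring is not implemented here; cosine similarity stands in as the default `score`, and a trained PLDA scorer could in principle be passed in its place.

```python
import math


def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return list(v) if norm == 0 else [x / norm for x in v]


def cosine_score(a, b):
    """Cosine similarity of two unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_shift_knn(vectors, k, score=cosine_score, n_iter=20, merge_threshold=0.99):
    """Cluster vectors with mean shift, using the k best-scoring points
    as the neighborhood at every step (hypothetical sketch).

    `score` is any similarity function; cosine is an illustrative stand-in
    for a PLDA score. Returns one integer cluster label per input vector.
    """
    points = [normalize(v) for v in vectors]
    modes = []
    for p in points:
        mode = p
        for _ in range(n_iter):
            # Neighborhood: the k points most similar to the current mode.
            neighbors = sorted(points, key=lambda q: score(mode, q), reverse=True)[:k]
            mean = [sum(col) / len(neighbors) for col in zip(*neighbors)]
            new_mode = normalize(mean)
            # Stop once the mode no longer moves (measured by cosine).
            converged = cosine_score(mode, new_mode) > 1 - 1e-9
            mode = new_mode
            if converged:
                break
        modes.append(mode)

    # Assumed post-processing: points whose modes nearly coincide share a label.
    centers, labels = [], []
    for m in modes:
        for idx, c in enumerate(centers):
            if cosine_score(m, c) >= merge_threshold:
                labels.append(idx)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels


# Toy data: two well-separated groups of 2-D vectors standing in for i-vectors.
demo = [[1.0, 0.05], [1.0, -0.05], [1.0, 0.0],
        [0.05, 1.0], [-0.05, 1.0], [0.0, 1.0]]
demo_labels = mean_shift_knn(demo, k=3)
```

Choosing the neighborhood by rank rather than by a fixed similarity threshold lets the effective bandwidth adapt to local density, which is one plausible reading of why a kNN-based bandwidth would be more robust than a fixed one when the number of speakers varies.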

Odyssey 2016

The Speaker and Language Recognition Workshop

Related Recordings

Analysis of the Impact of the Audio Database Characteristics in the Accuracy of a Speaker Clustering System

Short- and Long-Term Speech Features for Hybrid HMM-i-Vector based Speaker Diarization System