Analysis of the Impact of the Audio Database Characteristics in the Accuracy of a Speaker Clustering System

Jesús Jorrín Prieto, Carlos Vaquero, Paola García

In this paper, a traditional clustering algorithm based on speaker identification is presented. Several audio data sets were tested to determine how the accuracy of the clustering algorithm depends on the characteristics of the analyzed database. We show that factors such as the size of the database, the number of speakers, and how the recordings are balanced across the speakers in the database significantly affect the accuracy of the clustering task. These conclusions can be used to propose strategies for solving a clustering task or to predict in which situations higher performance of the clustering algorithm can be expected. We also focus on the stopping criterion, aiming to avoid the degradation in results caused by mismatch between training and testing data when traditional stopping criteria based on maximum distance thresholds are used.
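
To make the idea of a stopping criterion based on a maximum distance threshold concrete, the following is a minimal sketch of agglomerative clustering that stops merging once the closest pair of clusters is farther apart than a fixed threshold. It is not the authors' system: the Euclidean distance, the average linkage, the toy 2-D points standing in for speaker representations, and the names `average_linkage`, `cluster_with_threshold`, and `max_distance` are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def average_linkage(a, b, points):
    """Mean Euclidean distance over all cross-cluster pairs (assumed metric)."""
    total = sum(math.dist(points[i], points[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def cluster_with_threshold(points, max_distance):
    """Merge the closest pair of clusters until the closest pair is
    farther apart than max_distance (the stopping criterion)."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        (x, y), dist = min(
            (
                ((x, y), average_linkage(clusters[x], clusters[y], points))
                for x, y in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if dist > max_distance:
            break  # stopping criterion: no pair is close enough to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return clusters


# Toy data: two well-separated groups standing in for two speakers.
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
toy_clusters = cluster_with_threshold(toy_points, max_distance=1.0)
```

In this sketch the number of clusters returned depends entirely on `max_distance`, which is why a threshold tuned on one data set may stop too early or too late on data with different characteristics.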

Odyssey 2016

The Speaker and Language Recognition Workshop

Related Recordings

Influence of transition cost in the segmentation stage of speaker diarization

Short- and Long-Term Speech Features for Hybrid HMM-i-Vector based Speaker Diarization System