InterSpeech 2021

Identifying Conflict Escalation and Primates by Using Ensemble X-vectors and Fisher Vector Features
(Oral presentation)

José Vicente Egas-López (University of Szeged, Hungary), Mercedes Vetráb (University of Szeged, Hungary), László Tóth (University of Szeged, Hungary), Gábor Gosztolya (University of Szeged, Hungary)
Computational paralinguistics is concerned with the automatic identification of non-verbal information in human speech. The Interspeech ComParE challenge features new paralinguistic tasks each year; this year's tasks include, among others, a cross-corpus conflict escalation task and the identification of primates based solely on audio. In our entry to ComParE 2021, we utilize x-vectors and Fisher vectors as features. To improve the robustness of the predictions, we also experiment with building an ensemble of classifiers from the x-vectors. Lastly, we exploit the fact that the Escalation Sub-Challenge is a conflict detection task, and incorporate the SSPNet Conflict Corpus in our training workflow. Using these approaches, at the time of writing, we had already surpassed the official Challenge baselines on both tasks, which demonstrates the effectiveness of the employed techniques.
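
The abstract does not say how the ensemble of classifiers is built or how its outputs are combined. As a rough illustration of the general idea only, the sketch below trains several simple classifiers on bootstrap resamples of fixed-length utterance embeddings and averages their class posteriors. Everything in it is an assumption made for illustration: the nearest-centroid classifier, the bootstrap resampling, the posterior averaging, the toy 2-D vectors standing in for x-vectors, the labels "low" and "high", and all function names.

```python
import math
import random


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def train_centroids(features, labels):
    """Hypothetical base classifier: one centroid per class."""
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_class.items()}


def posteriors(model, x):
    """Softmax over negative squared distances to each class centroid."""
    scores = {
        y: -sum((a - b) ** 2 for a, b in zip(x, c)) for y, c in model.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}


def train_ensemble(features, labels, n_models=5, seed=0):
    """Train n_models base classifiers, each on a bootstrap resample."""
    rng = random.Random(seed)
    n = len(features)
    classes = set(labels)
    models = []
    while len(models) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        # Redraw if the resample happens to miss a class entirely.
        if {labels[i] for i in idx} != classes:
            continue
        models.append(
            train_centroids([features[i] for i in idx], [labels[i] for i in idx])
        )
    return models


def predict(models, x):
    """Average the members' posteriors and return (label, averaged posteriors)."""
    per_model = [posteriors(m, x) for m in models]
    avg = {
        y: sum(p[y] for p in per_model) / len(per_model) for y in per_model[0]
    }
    return max(avg, key=avg.get), avg


# Toy 2-D vectors standing in for utterance-level embeddings (assumed data).
features = [
    [0.0, 0.0], [0.1, 0.2], [0.2, -0.1], [-0.1, 0.1],
    [3.0, 3.0], [3.1, 2.9], [2.8, 3.2], [3.2, 3.1],
]
labels = ["low"] * 4 + ["high"] * 4

models = train_ensemble(features, labels, n_models=5, seed=0)
label, avg = predict(models, [0.05, 0.05])
print(label, avg)
```

Averaging posteriors across members trained on different resamples is one common way to reduce the variance of individual predictions; whether it matches the setup used in the challenge entry cannot be determined from the abstract alone.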