A Multi-condition Training Strategy for Countermeasures Against Spoofing Attacks to Speaker Recognizers
Joao Monteiro, Jahangir Alam, Tiago Falk
In this contribution, we address the design of effective strategies for training simple-to-use detectors of spoofing attacks on automatic speaker recognizers, i.e., systems that directly map data into scores indicating the likelihood of an attack, as opposed to complex pipelines involving several independent steps for training and inference. Given that artificial neural networks have driven the shift from pipelines to end-to-end systems in several applications, we specifically target this kind of model. The main challenge in training neural networks for the applications considered herein is that openly available spoofing corpora are relatively small, owing to the inherent difficulty of collecting or generating this kind of data. We thus employ a data augmentation strategy that introduces additional training examples, significantly increasing the size and diversity of the training data. Neural networks trained on the augmented data are shown to attain significant improvements in detection performance when compared to standard GMM-based classifiers.