Novel Variable Length Teager Energy Profiles for Replay Spoof Detection

Madhu Kamble, Hemant Patil

Replay attacks are mounted to gain fraudulent access to an Automatic Speaker Verification (ASV) system. Such an attack requires only a recording device and a playback device. The replayed speech is affected by the quality of these intermediate devices and by the level of noise in the acoustic environment. In this paper, we propose Variable length Teager Energy Cepstral Coefficients (VTECC) for the replay Spoof Speech Detection (SSD) task. Varying the Dependency Index (DI) in the Variable length Teager Energy Operator (VTEO) changes the performance of the SSD system. The Teager energy profiles and the spectral energy densities obtained show the discriminative information for different DIs. With DI = 5, we obtained a reduced Equal Error Rate (EER) of 6.52 % and 11.93 % on the development and evaluation sets, respectively, of the ASVspoof 2017 version 2.0 challenge database. Score-level fusion of the baseline system (Constant Q Cepstral Coefficients (CQCC) feature set) with VTECC further reduced the EER to 5.85 % and 10.94 % on the development and evaluation sets, respectively. For the evaluation set, we also investigate performance on different Replay Configurations (RC). For all threat levels, the proposed feature set performed better than the other feature sets.
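The role of the Dependency Index can be illustrated with a minimal Python sketch of a Teager energy profile. The abstract does not define the operator, so the code assumes the commonly used form Ψ[x(n)] = x(n)² − x(n − DI)·x(n + DI). The function name `vteo`, the parameter `di`, and the test tone are hypothetical choices made for illustration, not the authors' implementation.

```python
import numpy as np


def vteo(x, di=1):
    """Teager energy profile with dependency index `di` (illustrative sketch).

    Assumed definition: psi[n] = x[n]**2 - x[n - di] * x[n + di].
    Samples that lack a neighbour on either side are dropped, so the
    output has len(x) - 2 * di samples.
    """
    x = np.asarray(x, dtype=float)
    if di < 1:
        raise ValueError("di must be a positive integer")
    if x.size <= 2 * di:
        raise ValueError("signal is too short for this dependency index")
    # Centre samples squared, minus the product of the samples di steps
    # to the left and di steps to the right of each centre sample.
    return x[di:-di] ** 2 - x[:-2 * di] * x[2 * di:]


# Hypothetical test signal: a pure tone with amplitude 0.5.
n = np.arange(200)
omega = 2 * np.pi * 0.05  # digital frequency in radians per sample
tone = 0.5 * np.cos(omega * n)

# For a pure tone A*cos(omega*n), this operator yields the constant
# A**2 * sin(di * omega)**2, so the profile depends on both frequency and DI.
profile = vteo(tone, di=5)
```

Under this assumed definition, changing `di` changes how strongly each frequency component contributes to the energy profile, which is one way to read the abstract's statement that performance varies with DI.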

An Initial Investigation on Optimizing Tandem Speaker Verification and Countermeasure Systems Using Reinforcement Learning

Anssi Kanervisto, Ville Hautamäki, Tomi Kinnunen, Junichi Yamagishi

Odyssey 2020

The Speaker and Language Recognition Workshop



Related Recordings

Using Multi-Resolution Feature Maps with Convolutional Neural Networks for Anti-Spoofing in ASV
