InterSpeech 2021

Privacy-Preserving Feature Extraction for Cloud-Based Wake Word Verification
(3-minute introduction)

Timm Koppelmann (Ruhr-Universität Bochum, Germany), Alexandru Nelus (Ruhr-Universität Bochum, Germany), Lea Schönherr (Ruhr-Universität Bochum, Germany), Dorothea Kolossa (Ruhr-Universität Bochum, Germany), Rainer Martin (Ruhr-Universität Bochum, Germany)
Wake word detection and verification systems often involve a local, on-device wake word detector and a cloud-based verification node. In such systems, the audio representation sent to the cloud-based server may contain sensitive information that an eavesdropper could intercept. To improve the privacy of cloud-based wake word verification (WWV) systems, we propose a privacy-preserving feature representation that minimizes the automatic speech recognition (ASR) capability of a potential attacker. The proposed approach employs an adversarial training schedule that aims to maximize an attacker’s word error rate (WER) while maintaining high WWV performance. To this end, we apply an adaptive weighting factor in the combined loss function to control the balance between minimizing the WWV loss and maximizing the ASR loss. We show that the proposed training method significantly reduces possible privacy risks while maintaining strong WWV performance.
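The weighted objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact loss formula or the adaptation rule, so everything below is an assumption made for illustration: the form `wwv_loss - weight * asr_loss`, the threshold-based update, and the names `combined_loss`, `update_weight`, `wwv_target`, and `step` are all hypothetical and not taken from the paper.

```python
def combined_loss(wwv_loss: float, asr_loss: float, weight: float) -> float:
    """Assumed feature-extractor objective.

    Minimizing this value lowers the WWV loss and, through the negative
    sign, raises the attacker's ASR loss. `weight` sets the balance.
    """
    return wwv_loss - weight * asr_loss


def update_weight(
    weight: float,
    wwv_loss: float,
    wwv_target: float = 0.1,  # hypothetical acceptable WWV loss
    step: float = 0.05,       # hypothetical adaptation step size
    lo: float = 0.0,
    hi: float = 1.0,
) -> float:
    """Hypothetical adaptation rule (not the paper's).

    If verification is still good enough, put more emphasis on privacy;
    otherwise back off so that WWV performance can recover.
    """
    if wwv_loss <= wwv_target:
        weight += step
    else:
        weight -= step
    # Keep the weight inside its allowed range.
    return min(hi, max(lo, weight))


if __name__ == "__main__":
    weight = 0.5
    # Made-up (WWV loss, ASR loss) pairs standing in for training steps.
    for wwv, asr in [(0.05, 2.0), (0.08, 2.5), (0.30, 3.0)]:
        loss = combined_loss(wwv, asr, weight)
        weight = update_weight(weight, wwv)
        print(f"loss={loss:.3f} next_weight={weight:.2f}")
```

In an actual adversarial setup, the attacker's ASR model would be trained in alternation to minimize its own loss on the extracted features; the sketch covers only the feature extractor's side of that schedule.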