Bidirectional Multiscale Feature Aggregation for Speaker Verification (3 minutes introduction)

Jiajun Qi (USTC, China), Wu Guo (USTC, China), Bin Gu (USTC, China)

In this paper, we propose a novel bidirectional multiscale feature aggregation (BMFA) network with attentional fusion modules for text-independent speaker verification. The feature maps from different stages of the backbone network are iteratively combined and refined in both a bottom-up and top-down manner. Furthermore, instead of simple concatenation or elementwise addition of feature maps from different stages, an attentional fusion module is designed to compute the fusion weights. Experiments are conducted on the NIST SRE16 and VoxCeleb1 datasets. The experimental results demonstrate the effectiveness of the bidirectional aggregation strategy and show that the proposed attentional fusion module can further improve the performance.
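The aggregation idea described in the abstract can be illustrated with a toy sketch. This is not the authors' BMFA network: it assumes every stage already has the same shape (a real backbone would need resampling or projection between stages), and the scoring function `mean_score` is a hypothetical stand-in for a learned attention module. The names `attentional_fuse` and `bidirectional_aggregate` are likewise invented for illustration. The sketch only shows how softmax-normalized fusion weights can replace plain elementwise addition, and how a top-down pass can be followed by a bottom-up pass.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_score(channel):
    """Hypothetical attention score: the channel mean (a learned module in practice)."""
    return sum(channel) / len(channel)


def attentional_fuse(feat_a, feat_b, score_fn=mean_score):
    """Fuse two same-shaped feature maps (list of channels, each a list of frames).

    Each channel pair gets its own pair of weights that sum to 1,
    instead of the fixed 1:1 weighting of elementwise addition.
    """
    fused = []
    for ch_a, ch_b in zip(feat_a, feat_b):
        w_a, w_b = softmax([score_fn(ch_a), score_fn(ch_b)])
        fused.append([w_a * a + w_b * b for a, b in zip(ch_a, ch_b)])
    return fused


def bidirectional_aggregate(stages, score_fn=mean_score):
    """Refine stage outputs with a top-down pass followed by a bottom-up pass.

    Assumption: all stages share one shape, so no resampling is needed.
    """
    # Top-down: each shallower stage absorbs the refined deeper stage.
    top_down = list(stages)
    for i in range(len(stages) - 2, -1, -1):
        top_down[i] = attentional_fuse(stages[i], top_down[i + 1], score_fn)

    # Bottom-up: each deeper stage absorbs the refined shallower stage.
    bottom_up = list(top_down)
    for i in range(1, len(stages)):
        bottom_up[i] = attentional_fuse(top_down[i], bottom_up[i - 1], score_fn)
    return bottom_up


if __name__ == "__main__":
    # Two stages, one channel each, two frames per channel.
    stages = [[[1.0, 1.0]], [[3.0, 3.0]]]
    print(bidirectional_aggregate(stages))
```

Because the weights of each channel pair sum to 1, every fused value is a convex combination of its inputs; when both inputs score equally, the fusion reduces to a plain average.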

Related Recordings

Improving Time Delay Neural Network Based Speaker Recognition With Convolutional Block And Feature Aggregation Methods
(3 minutes introduction)

Yu-Jia Zhang, Yih-Wen Wang, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan

Improving Time Delay Neural Network Based Speaker Recognition With Convolutional Block And Feature Aggregation Methods
(longer introduction)

Yu-Jia Zhang, Yih-Wen Wang, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan

InterSpeech 2021
