InterSpeech 2021

Assessment of von Mises-Bernoulli Deep Neural Network in Sound Source Localization
(3-minute introduction)

Katsutoshi Itoyama (Tokyo Tech, Japan), Yoshiya Morimoto (Tokyo Tech, Japan), Shungo Masaki (Tokyo Tech, Japan), Ryosuke Kojima (Kyoto University, Japan), Kenji Nishida (Tokyo Tech, Japan), Kazuhiro Nakadai (Tokyo Tech, Japan)
This paper examines the properties and effectiveness of the von Mises-Bernoulli deep neural network (vM-B DNN), a neural network capable of learning periodic information, in sound source localization. Phase, which is periodic, is an important cue in sound source localization, but typical neural networks cannot handle periodic input values properly. The vM-B DNN has been shown theoretically to handle periodic input values, and its effectiveness has been demonstrated in a simple case study of sound source localization using artificial sinusoids, but not yet for speech signals. We conducted both numerical simulations and real-environment experiments, comparing a sound source localization method using the vM-B DNN with methods using ordinary neural networks, and showed that the vM-B DNN outperforms the other methods under various conditions.
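
The difficulty with periodic inputs mentioned in the abstract can be illustrated with a small sketch. This is not the vM-B DNN and is not taken from the paper; it only shows, under the assumption that phases are given in radians in [-π, π], why feeding raw phase values to an ordinary network is awkward: two phases just either side of the wrap-around point are physically almost identical but numerically far apart. The function names (`naive_distance`, `circular_distance`, `embed`) are hypothetical and chosen for illustration only.

```python
import math


def naive_distance(a: float, b: float) -> float:
    """Distance between two phases treated as ordinary real numbers."""
    return abs(a - b)


def circular_distance(a: float, b: float) -> float:
    """Distance between two phases measured along the circle (in radians)."""
    # Wrap the difference into [-pi, pi) before taking its magnitude.
    wrapped = (a - b + math.pi) % (2 * math.pi) - math.pi
    return abs(wrapped)


def embed(phase: float) -> tuple[float, float]:
    """Map a phase onto the unit circle so that nearby phases stay nearby."""
    return (math.cos(phase), math.sin(phase))


# Two phases just either side of the wrap-around point at +/- pi.
a = math.pi - 0.01
b = -math.pi + 0.01

print("naive distance:    ", naive_distance(a, b))
print("circular distance: ", circular_distance(a, b))
print("embedded distance: ", math.dist(embed(a), embed(b)))
```

The naive distance is close to 2π even though the two phases are nearly the same angle, while the circular and unit-circle distances are both small. The (cos, sin) embedding is a common generic workaround and is shown here only for contrast; the vM-B DNN described in the abstract addresses periodic inputs within the network model itself.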