Speech Disorder Classification Using Extended Factorized Hierarchical Variational Auto-encoders
(Oral presentation)

Jinzi Qi (KU Leuven, Belgium), Hugo Van hamme (KU Leuven, Belgium)

Objective speech disorder classification for speakers with communication difficulties is desirable for diagnosis and for administering therapy. Given the current state of speech technology, neural networks are a natural choice for this application, but their training is hampered by a lack of labeled disordered speech data. In this research, we apply an extended version of Factorized Hierarchical Variational Auto-encoders (FHVAE) for representation learning on disordered speech. The FHVAE model extracts both content-related and sequence-related latent variables from speech data, and we use the extracted variables to explore how disorder type information is represented in them. For better classification performance, the latent variables are aggregated at the word and sentence level. We show that an extension of the FHVAE model achieves better disentanglement of the content-related and sequence-related representations, but that both representations are still required for the best results on disorder type classification.
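
The aggregation step mentioned in the abstract can be pictured as pooling segment-level latent vectors over each word or sentence. The following is a minimal sketch only, assuming mean pooling, concatenation of the two pooled representations, and a hypothetical data layout (tuples of a unit ID, a content-related vector, and a sequence-related vector). The abstract does not state which pooling operation or feature combination the authors actually used, and the names `mean_pool` and `aggregate_by_unit` are illustrative.

```python
from collections import defaultdict
from statistics import fmean


def mean_pool(vectors):
    """Average a non-empty list of equal-length vectors element-wise."""
    return [fmean(dim) for dim in zip(*vectors)]


def aggregate_by_unit(segments):
    """Pool segment-level latent vectors per unit (word or sentence).

    `segments` is a list of (unit_id, content_vec, sequence_vec) tuples;
    this layout is hypothetical, chosen only for illustration.
    Returns {unit_id: pooled content vector + pooled sequence vector}.
    """
    grouped = defaultdict(lambda: ([], []))
    for unit_id, content_vec, sequence_vec in segments:
        grouped[unit_id][0].append(content_vec)
        grouped[unit_id][1].append(sequence_vec)
    # Concatenating both pooled representations is an assumption, motivated
    # by the abstract's finding that both are needed for best results.
    return {
        unit_id: mean_pool(c) + mean_pool(s)
        for unit_id, (c, s) in grouped.items()
    }


# Toy data: two segments belonging to one word, one segment to another.
segments = [
    ("word_1", [1.0, 3.0], [0.5]),
    ("word_1", [3.0, 5.0], [1.5]),
    ("word_2", [2.0, 2.0], [4.0]),
]
features = aggregate_by_unit(segments)
print(features)
```

The resulting per-word (or per-sentence) feature vectors could then be passed to any standard classifier to predict the disorder type; the same function works at the sentence level by using sentence IDs as `unit_id`.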

Related Recordings

Automatic extraction of speech rhythm descriptors for speech intelligibility assessment in the context of head and neck cancers
(Oral presentation)

Robin Vaysse (France), Jérôme Farinas (France), Corine Astésano (France), Régine André-Obrecht (France)

The Impact of Forced-Alignment Errors on Automatic Pronunciation Evaluation
(Oral presentation)

Vikram C. Mathad, Tristan J. Mahr, Nancy Scherer, Kathy Chapman, Katherine C. Hustad, Julie Liss, Visar Berisha

InterSpeech 2021
