InterSpeech 2021

Improvement of Automatic English Pronunciation Assessment with Small Number of Utterances Using Sentence Speakability
(3-minute introduction)

Satsuki Naijo (Tohoku University, Japan), Akinori Ito (Tohoku University, Japan), Takashi Nose (Tohoku University, Japan)
The current Computer-Assisted Pronunciation Training (CAPT) system uses DNN-based speech recognition results to evaluate a learner's pronunciation, and it achieves high accuracy when many utterances are available for the evaluation. However, when only a few utterances are used, its accuracy deteriorates. One reason is that, with a small number of utterances, the score calculated by a CAPT system is biased by the pronunciation difficulty of the particular sentences the learner happened to read. In this study, we developed a CAPT system that takes sentence speakability (the pronunciation difficulty of a sentence) into account. As a result, the correlation coefficient between the human evaluation and the machine score improved from 0.46 with the conventional method to 0.57 with the proposed method.
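The bias described in the abstract can be illustrated with a minimal sketch. It is not the authors' method, which the abstract does not specify. It assumes, purely for illustration, that speakability can be approximated by a per-sentence average machine score (the hypothetical `sentence_mean` table) and that the correction is a simple subtraction of that average. The names `difficulty_adjusted_score` and `pearson` are likewise hypothetical, and the numbers are toy values, not data from the paper.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def raw_score(utterances):
    """Conventional score: plain average of per-utterance machine scores."""
    return mean(score for _, score in utterances)


def difficulty_adjusted_score(utterances, sentence_mean):
    """Average of per-utterance scores after removing a per-sentence offset.

    utterances: list of (sentence_id, machine_score) pairs for one learner.
    sentence_mean: ASSUMED speakability proxy, mapping each sentence_id to
        the average machine score that sentence typically receives.
    """
    return mean(score - sentence_mean[sid] for sid, score in utterances)


# Toy values (hypothetical): one easy and one hard sentence.
sentence_mean = {"easy": 0.9, "hard": 0.5}

# Two learners who are each exactly average for the sentences they read,
# but who were assigned sentences of different difficulty.
learner_a = [("easy", 0.95), ("easy", 0.85)]
learner_b = [("hard", 0.55), ("hard", 0.45)]

raw_a, raw_b = raw_score(learner_a), raw_score(learner_b)
adj_a = difficulty_adjusted_score(learner_a, sentence_mean)
adj_b = difficulty_adjusted_score(learner_b, sentence_mean)
```

In this toy setting the raw scores of the two learners differ widely even though both are average for their sentences, while the adjusted scores coincide. With real data, `pearson` could be used to compare either score against human ratings, which is the kind of correlation the abstract reports.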