Class-Based Neural Network Language Model For Second-Pass Rescoring In ASR (3 minutes introduction)

Lingfeng Dai (SJTU, China), Qi Liu (SJTU, China), Kai Yu (SJTU, China)

Language model rescoring, especially neural network language model (NNLM) rescoring, is widely used to improve performance in the second pass of an automatic speech recognition (ASR) system. The rescoring NNLM is usually trained separately from the ASR system. The two are typically trained on different corpora, which leads to a vocabulary mismatch that degrades ASR performance. Previous research has focused more on the language domain mismatch problem, while the vocabulary mismatch problem, which may also cause significant performance degradation, has not been well studied. This paper proposes a novel class-based NNLM framework to address the vocabulary mismatch problem in language model rescoring. Here, out-of-vocabulary (OOV) words, that is, words unknown to the rescoring NNLM, are assigned to well-trained classes of the NNLM and inherit the class probability. Experiments show that class-based NNLM rescoring can significantly reduce the performance degradation caused by vocabulary mismatch.
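
The idea of letting an OOV word inherit a class probability can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how classes are built or how OOV words are assigned to them, so the lookup tables, the function names `word_logprob` and `rescore`, the `oov_in_class_logprob` parameter, and the interpolation weight `lm_weight` are all hypothetical. The sketch assumes the common class factorization P(w | h) = P(c(w) | h) · P(w | c(w)) for known words, and scores an OOV word with the probability of its assigned class.

```python
import math


def word_logprob(word, class_logprobs, word_to_class, in_class_logprobs,
                 oov_to_class, oov_in_class_logprob=0.0):
    """Log-probability of `word` given class log-probs for the current history.

    All tables are hypothetical stand-ins for what a class-based NNLM
    would provide; they are not taken from the paper.
    """
    if word in word_to_class:
        # Known word: log P(class | history) + log P(word | class).
        cls = word_to_class[word]
        return class_logprobs[cls] + in_class_logprobs[word]
    # OOV word: inherit the probability of its assigned class.
    # oov_in_class_logprob = 0.0 means "use the class probability as is";
    # a negative value would be an extra within-class penalty (assumption).
    cls = oov_to_class[word]
    return class_logprobs[cls] + oov_in_class_logprob


def rescore(nbest, lm_score, lm_weight=0.5):
    """Pick the best hypothesis from (words, first_pass_score) pairs.

    `lm_score` maps a word list to an LM log-probability; the linear
    combination with `lm_weight` is an assumed, generic rescoring rule.
    """
    return max(nbest, key=lambda hyp: hyp[1] + lm_weight * lm_score(hyp[0]))


# Toy tables; a real NNLM would recompute class_logprobs for every history.
class_logprobs = {"CITY": math.log(0.2), "OTHER": math.log(0.8)}
word_to_class = {"paris": "CITY", "to": "OTHER"}
in_class_logprobs = {"paris": math.log(0.5), "to": math.log(0.1)}
oov_to_class = {"shanghai": "CITY"}  # hypothetical OOV-to-class assignment


def toy_lm_score(words):
    return sum(word_logprob(w, class_logprobs, word_to_class,
                            in_class_logprobs, oov_to_class) for w in words)


nbest = [(["to", "paris"], -4.0), (["to", "shanghai"], -3.5)]
best_words, best_first_pass = rescore(nbest, toy_lm_score)
```

With these toy numbers the OOV word "shanghai" receives the full "CITY" class probability instead of a generic unknown-word score, so the second hypothesis is not penalized merely for containing a word the rescoring model never saw.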

InterSpeech 2021

Related Recordings

Token-Level Supervised Contrastive Learning for Punctuation Restoration
(3 minutes introduction)

Correcting Automated and Manual Speech Transcription Errors using Warped Language Models
(3 minutes introduction)
