SuperLectures.com

DISCRIMINATIVE DURATION MODELING FOR SPEECH RECOGNITION WITH SEGMENTAL CONDITIONAL RANDOM FIELDS

Full Paper at IEEE Xplore

Speech Analysis

Presenter: Patrick Nguyen. Authors: Justine Kao, Stanford University, United States; Geoffrey Zweig, Patrick Nguyen, Microsoft Research, United States

This paper describes a new approach to modeling duration for large-vocabulary continuous speech recognition (LVCSR) using SCARF, a toolkit for speech recognition with segmental conditional random fields. We use SCARF’s ability to integrate long-span, segment-level features to design and test duration models that help discriminate between correct and incorrect word hypotheses. We show that the duration distributions of correct and incorrect word hypotheses differ. Given a word hypothesis in the lattice and its duration, conditional length probabilities are integrated into the SCARF system as duration features. We evaluate three kinds of duration features on Broadcast News: word durations, pre- and post-pausal durations, and word span confusions. Adding the duration features to SCARF yields an improvement of up to 0.3% over a state-of-the-art discriminatively trained baseline of 15.3% WER on a Broadcast News task.
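To make the idea of a conditional length probability concrete, here is a minimal sketch in Python. It is not the SCARF implementation and not necessarily the feature definition used in the paper. It assumes durations are measured in frames, quantized into fixed-width bins, and smoothed with add-alpha smoothing. The names `DurationModel`, `duration_bin`, and `duration_feature`, the bin settings, and the choice of a log-probability ratio between a "correct" and an "incorrect" model are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def duration_bin(num_frames, bin_width=5, max_bin=20):
    """Quantize a duration in frames into a bin index, capped at max_bin."""
    return min(num_frames // bin_width, max_bin)


class DurationModel:
    """Smoothed histogram estimate of P(duration bin | word).

    Hypothetical helper for illustration; not part of SCARF.
    """

    def __init__(self, bin_width=5, max_bin=20, alpha=1.0):
        self.bin_width = bin_width
        self.max_bin = max_bin
        self.alpha = alpha  # add-alpha smoothing constant (assumed)
        self.counts = defaultdict(Counter)

    def observe(self, word, num_frames):
        """Record one training hypothesis of `word` lasting `num_frames`."""
        b = duration_bin(num_frames, self.bin_width, self.max_bin)
        self.counts[word][b] += 1

    def log_prob(self, word, num_frames):
        """Return the smoothed log P(duration bin | word)."""
        b = duration_bin(num_frames, self.bin_width, self.max_bin)
        word_counts = self.counts.get(word, Counter())
        total = sum(word_counts.values())
        num_bins = self.max_bin + 1
        return math.log(
            (word_counts[b] + self.alpha) / (total + self.alpha * num_bins)
        )


def duration_feature(correct_model, incorrect_model, word, num_frames):
    """Log-ratio of the duration probability under the two models.

    Positive values mean the duration looks more like those of correct
    hypotheses of this word. The ratio form is an assumption of this
    sketch, not a definition taken from the paper.
    """
    return correct_model.log_prob(word, num_frames) - incorrect_model.log_prob(
        word, num_frames
    )


# Toy data, invented for illustration only (durations in frames).
correct = DurationModel()
incorrect = DurationModel()
for frames in (8, 9, 10, 12):
    correct.observe("the", frames)
for frames in (40, 45, 50):
    incorrect.observe("the", frames)

short_score = duration_feature(correct, incorrect, "the", 9)
long_score = duration_feature(correct, incorrect, "the", 45)
```

With this toy data, `short_score` is positive and `long_score` is negative, because a 9-frame "the" resembles the correct examples while a 45-frame one resembles the incorrect ones. In a segmental CRF such a score would be attached to a word hypothesis in the lattice as one segment-level feature, with its weight learned discriminatively alongside the other features.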


  Lecture information

Recorded: 2011-05-25 10:30 - 10:50, Panorama
Added: 15 June 2011, 16:48
Views: 30
Video resolution: 1024x576 px, 512x288 px
Video length: 0:15:07
Audio track: MP3 [5.08 MB], 0:15:07