SuperLectures.com

AN INVESTIGATION OF SUBSPACE MODELING FOR PHONETIC AND SPEAKER VARIABILITY IN AUTOMATIC SPEECH RECOGNITION

Full Paper at IEEE Xplore

Acoustic Modeling

Presenter: Richard Rose, Authors: Richard Rose, Shou-Chun Yin, Yun Tang, McGill University, Canada

This paper investigates the impact of subspace-based techniques for modeling speaker variability and phonetic variability in automatic speech recognition (ASR). Many well-known approaches to speaker-space-based adaptation represent sources of variability as a projection within a low-dimensional subspace. A new approach to acoustic modeling in ASR, referred to as the subspace-based Gaussian mixture model (SGMM), represents phonetic variability as a set of projections applied at the state level in a hidden Markov model (HMM) based acoustic model. The impact of the SGMM in modeling these intrinsic sources of variability is evaluated on a continuous speech recognition (CSR) task where the performance of continuous-density HMM (CDHMM) based ASR systems is already reasonably good. Speaker-independent SGMM-based ASR was shown to provide an 18% reduction in word error rate (WER) over the CDHMM and a 5% reduction in WER over unsupervised speaker adaptation in the Resource Management CSR domain.
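The idea of state-level projections can be illustrated with a small sketch. It assumes the commonly published SGMM form, in which the mean of shared Gaussian i in state j is M_i v_j: M_i is a projection matrix shared by all states and v_j is a low-dimensional state vector. That formula, the toy dimensions, and every name below (`project`, `sgmm_state_means`, `projections`, `v_state`) are illustrative assumptions rather than details taken from the abstract, and mixture weights and covariances are left out.

```python
import random


def project(matrix, vector):
    """Multiply a D x S matrix (given as a list of rows) by a length-S vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def sgmm_state_means(projections, state_vector):
    """Return one mean per shared Gaussian for a single HMM state.

    Assumed form: mean_i = M_i v, where M_i is shared across all states
    and v is the state-specific low-dimensional vector.
    """
    return [project(m_i, state_vector) for m_i in projections]


def mean_parameters_per_state(feature_dim, subspace_dim, num_gaussians):
    """Compare per-state mean parameters: (conventional GMM state, SGMM state)."""
    return num_gaussians * feature_dim, subspace_dim


# Toy sizes, chosen only for illustration.
FEATURE_DIM, SUBSPACE_DIM, NUM_GAUSSIANS = 3, 2, 4

random.seed(0)
# Hypothetical shared projection matrices M_1..M_I with random entries.
projections = [
    [[random.gauss(0.0, 1.0) for _ in range(SUBSPACE_DIM)]
     for _ in range(FEATURE_DIM)]
    for _ in range(NUM_GAUSSIANS)
]

v_state = [0.5, -1.0]  # hypothetical state vector v_j
means = sgmm_state_means(projections, v_state)

for i, mean in enumerate(means):
    print(f"Gaussian {i}: mean = {[round(x, 3) for x in mean]}")
```

Under these assumptions each state stores only its own vector, while the projection matrices are shared by all states, so `mean_parameters_per_state(3, 2, 4)` counts 12 mean parameters for a conventional state against 2 for the subspace form.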


  Slides

0:00:16  Slide 1
0:00:39  Slide 2
0:02:29  Slide 3
0:03:57  Slide 4
0:05:04  Slide 5
0:08:39  Slide 6
0:09:44  Slide 7
0:11:36  Slide 8
0:12:47  Slide 9
0:14:10  Slide 10
0:15:02  Slide 11
0:16:46  Slide 12
0:18:17  Slide 13
0:19:44  Slide 7

  Lecture information

Recorded: 2011-05-25 15:25 - 15:45, Panorama
Added: 2011-06-15 17:03
Views: 73
Video resolution: 1024x576 px, 512x288 px
Video length: 0:20:20
Audio track: MP3 [6.87 MB], 0:20:20