SuperLectures.com

SPEAKER CHARACTERIZATION USING SPECTRAL SUBBAND ENERGY RATIO BASED ON HARMONIC PLUS NOISE MODEL

Full Paper at IEEE Xplore

Miscellaneous Speaker Identification

Presenter: Zhi-Jie Yan, Authors: Yanhua Long, University of Science and Technology of China, China; Zhi-Jie Yan, Frank K. Soong, Microsoft Research Asia, China; Li-Rong Dai, Wu Guo, University of Science and Technology of China, China

This paper proposes a feature extraction method for speaker characterization that explores the relationship between two distinct components of the speech signal: the harmonics, which account for the periodicity of the signal, and the modulated noise, which accounts for the turbulence of the glottal airflow. The speech signal is decomposed into its harmonic and noise parts using the Harmonic plus Noise Model. We estimate spectral subband energy ratios (SSERs) as speaker-characteristic features, which are expected to reflect how the vocal tract and glottal airflow interact in individual speakers, and use them for speaker verification. Speaker verification experiments on a GMM-UBM system show the effectiveness of the SSER features: combining them with conventional MFCC features reduces the equal error rate by 27.2%.
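The abstract does not give the exact SSER formula, so the following is only a minimal sketch of the general idea, assuming the harmonic and noise parts of a frame are already available from some harmonic-plus-noise decomposition. The function name `subband_energy_ratios`, the equal-width linear subbands, the Hann window, the FFT size, and the dB scaling are all illustrative assumptions, not the authors' settings. The decomposition itself is not shown; a synthetic frame stands in for it.

```python
import numpy as np

def subband_energy_ratios(harmonic, noise, n_bands=8, n_fft=512, eps=1e-12):
    """Per-subband harmonic-to-noise energy ratio in dB for one frame.

    Hypothetical helper: `harmonic` and `noise` are assumed to be the
    time-domain outputs of a harmonic-plus-noise decomposition.
    """
    harmonic = np.asarray(harmonic, dtype=float)
    noise = np.asarray(noise, dtype=float)
    window = np.hanning(len(harmonic))
    # Power spectra of the two windowed components (zero-padded to n_fft).
    h_pow = np.abs(np.fft.rfft(harmonic * window, n_fft)) ** 2
    n_pow = np.abs(np.fft.rfft(noise * window, n_fft)) ** 2
    # Equal-width linear subbands: an assumption, not the paper's layout.
    edges = np.linspace(0, len(h_pow), n_bands + 1).astype(int)
    h_energy = np.add.reduceat(h_pow, edges[:-1])
    n_energy = np.add.reduceat(n_pow, edges[:-1])
    # eps guards against division by zero and log of zero.
    return 10.0 * np.log10((h_energy + eps) / (n_energy + eps))

# Synthetic 25 ms frame at 16 kHz: ten harmonics of 200 Hz plus weak white noise.
sr = 16000
t = np.arange(400) / sr
harmonic = sum(np.sin(2 * np.pi * 200 * k * t) for k in range(1, 11))
rng = np.random.default_rng(0)
noise = 0.01 * rng.standard_normal(t.size)

ssers = subband_energy_ratios(harmonic, noise)
print(np.round(ssers, 1))
```

In this toy frame the harmonics sit below about 2 kHz, so the low subbands are harmonic-dominated and the high subbands are noise-dominated. In a verification system, a per-frame vector like this would be appended to or scored alongside the MFCC features.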


  Slides

0:00:16  Slide 1
0:01:11  Slide 2
0:01:50  Slide 3
0:02:44  Slide 4
0:03:48  Slide 5
0:05:30  Slide 6
0:06:23  Slide 7
0:07:34  Slide 8
0:08:44  Slide 9
0:09:12  Slide 10
0:10:10  Slide 11
0:10:26  Slide 12
0:11:22  Slide 13
0:11:47  Slide 14
0:12:21  Slide 15
0:13:37  Slide 16
0:14:35  Slide 17


  Lecture information

Recorded: 2011-05-25 16:55 - 17:15, Panorama
Added: 2011-06-15 18:05
Views: 44
Video resolution: 1024x576 px, 512x288 px
Video length: 0:22:35
Audio track: MP3 [7.65 MB], 0:22:35