SuperLectures.com

IMPROVING TEXT-INDEPENDENT PHONETIC SEGMENTATION BASED ON THE MICROCANONICAL MULTISCALE FORMALISM

Speech Analysis

Full Paper at IEEE Xplore

Presented by: Vahid Khanagha, Author(s): Vahid Khanagha, Khalid Daoudi, Oriol Pont, Hussein Yahia, INRIA Bordeaux Sud-Ouest, France

In an earlier work, we proposed a novel phonetic segmentation method based on speech analysis under the Microcanonical Multiscale Formalism (MMF). The MMF relies on the computation of local geometrical parameters called singularity exponents (SE). We showed that SE convey valuable information about the local dynamics of speech that can be readily and simply used to detect phoneme boundaries. Based on an error analysis of our original algorithm, we propose in this paper a two-step technique that better exploits SE to improve segmentation accuracy. In the first step, we detect the boundaries of the original signal and of a low-pass filtered version, and we take the union of all detected boundaries as candidates. In the second step, we use a hypothesis test over the local SE distribution of the original signal to select the final boundaries. We carry out a detailed evaluation over the full training set of the TIMIT database, which could be useful to other researchers for comparison purposes. The results show that the new algorithm not only outperforms the original one but is also significantly more accurate than state-of-the-art ones.
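
The two-step structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the singularity exponents and the boundary lists for the original and low-pass filtered signals are taken as already computed; the names `merge_candidates`, `select_boundaries`, `min_gap`, `window` and `threshold` are hypothetical; and a Welch t-statistic stands in for the hypothesis test, whose exact form the abstract does not state.

```python
from math import sqrt
from statistics import mean, variance


def merge_candidates(boundaries_a, boundaries_b, min_gap):
    """Step 1 (sketch): union of two boundary lists.

    Candidates closer than min_gap samples to the previously kept
    one are dropped (an assumption, not stated in the abstract).
    """
    merged = []
    for b in sorted(set(boundaries_a) | set(boundaries_b)):
        if merged and b - merged[-1] < min_gap:
            continue
        merged.append(b)
    return merged


def welch_statistic(left, right):
    """Absolute Welch t-statistic between two samples (stand-in test)."""
    se = sqrt(variance(left) / len(left) + variance(right) / len(right))
    if se == 0.0:
        return 0.0 if mean(left) == mean(right) else float("inf")
    return abs(mean(left) - mean(right)) / se


def select_boundaries(exponents, candidates, window, threshold):
    """Step 2 (sketch): keep a candidate only if the exponent values in
    the windows on either side of it differ by more than threshold."""
    kept = []
    for c in candidates:
        left = exponents[max(0, c - window):c]
        right = exponents[c:c + window]
        if len(left) < 2 or len(right) < 2:
            continue  # not enough samples to estimate a variance
        if welch_statistic(left, right) > threshold:
            kept.append(c)
    return kept


# Toy data: exponents jump from about 0.1 to about 0.9 at index 10.
low = [0.10, 0.12, 0.09, 0.11, 0.10, 0.12, 0.09, 0.11, 0.10, 0.12]
high = [0.90, 0.92, 0.89, 0.91, 0.90, 0.92, 0.89, 0.91, 0.90, 0.92]
exponents = low + high

candidates = merge_candidates([10], [5, 11], min_gap=2)
final = select_boundaries(exponents, candidates, window=4, threshold=5.0)
print(candidates, final)
```

In this toy run the spurious candidate at index 5 is rejected because the exponent values on both sides of it look alike, while the candidate at index 10 survives. The threshold and window length are illustrative values only.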


  Lecture Information

Recorded: 2011-05-25 11:10 - 11:30, Panorama
Added: 15. 6. 2011 16:35
Video resolution: 1024x576 px, 512x288 px
Video length: 0:19:28
Audio track: MP3 [6.58 MB], 0:19:28