SuperLectures.com

ONLINE DETECTION OF VOCAL LISTENER RESPONSES WITH MAXIMUM LATENCY CONSTRAINTS

Audio/Visual Detection of Non-Linguistic Vocal Outbursts

Full Paper at IEEE Xplore

Speaker: Daniel Neiberg. Authors: Daniel Neiberg, KTH - Royal Institute of Technology, Sweden; Khiet P. Truong, University of Twente, Netherlands

When human listeners utter Listener Responses (e.g. back-channels or acknowledgments) such as 'yeah' and 'mmhmm', interlocutors commonly continue to speak or resume their speech even before the listener has finished his/her response. This type of speech interactivity results in frequent speech overlap, which is common in human-human conversation. To allow this type of interactivity between humans and spoken dialog systems, and thereby make human-machine interaction smoother, more continuous, and more human-like, we propose an online classifier that decides whether incoming speech is a Listener Response. We show that vocal Listener Responses can be detected under maximum latency thresholds of 100-500 ms, with equal error rates ranging from 34% to 28%, using an energy-based voice activity detector.
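The equal error rate quoted in the abstract is the operating point at which the miss rate equals the false alarm rate. As a rough illustration of the metric only, here is a minimal Python sketch. It assumes each speech segment has a detector score where higher means "more likely a Listener Response". The function name `equal_error_rate` and the toy scores are hypothetical and do not come from the paper, whose classifier and evaluation code are not described on this page.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the equal error rate (EER) by sweeping thresholds.

    target_scores:    scores for segments that truly are Listener Responses
    nontarget_scores: scores for all other speech segments
    Returns (eer, threshold) at the threshold where the miss rate and the
    false alarm rate are closest; the EER is reported as their mean there.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # Miss: a true Listener Response scored below the threshold.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: other speech scored at or above the threshold.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if best is None or gap < best[0]:
            best = (gap, (miss + fa) / 2, t)
    return best[1], best[2]


# Toy scores (made up for illustration, not data from the paper).
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
# eer == 0.25 at threshold == 0.6: one of four targets is missed and
# one of four non-targets is falsely accepted.
```

Under a tighter latency constraint the detector sees less of each segment before it must decide, which is consistent with the higher error rate the abstract reports at the short end of the 100-500 ms range.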


  Slides

0:00:16  Slide 1
0:00:28  Slide 2
0:01:04  Slide 3
0:01:28  Slide 4
0:02:04  Slide 5
0:02:49  Slide 6
0:03:06  Slide 7
0:04:01  Slide 8
0:04:40  Slide 9
0:05:05  Slide 10
0:05:44  Slide 11
0:06:24  Slide 12
0:07:18  Slide 13
0:07:54  Slide 14
0:08:47  Slide 15
0:09:33  Slide 16

  Lecture information

Recorded: 2011-05-25 14:25 - 14:45, Club D
Added: 2011-06-20 00:17
Views: 22
Video resolution: 1024x576 px, 512x288 px
Video length: 0:11:42
Audio track: MP3 [3.91 MB], 0:11:42