SuperLectures.com

Conference: ICASSP 2011

ONLINE DETECTION OF VOCAL LISTENER RESPONSES WITH MAXIMUM LATENCY CONSTRAINTS

Audio/Visual Detection of Non-Linguistic Vocal Outbursts

Full Paper at IEEE Xplore

Presented by: Daniel Neiberg, Author(s): Daniel Neiberg, KTH - Royal Institute of Technology, Sweden; Khiet P. Truong, University of Twente, Netherlands

When human listeners utter Listener Responses (e.g., back-channels or acknowledgments) such as 'yeah' and 'mmhmm', interlocutors commonly continue to speak or resume their speech even before the listener has finished the response. This interactivity produces frequent speech overlap, which is common in human-human conversation. To allow the same interactivity between humans and spoken dialog systems, and thereby make human-machine interaction smoother and more human-like, we propose an online classifier that labels incoming speech as Listener Responses. We show that vocal Listener Responses can be detected under maximum latency thresholds of 100-500 ms, with equal error rates ranging from 34% to 28%, using an energy-based voice activity detector.
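
The idea of an energy-based voice activity detector combined with a maximum latency constraint can be illustrated with a minimal sketch. This is not the authors' classifier: the function names, the -40 dB threshold, the 10 ms frame length, and the duration-based labeling rule are all assumptions chosen for illustration. The sketch simulates online operation over a recorded signal by inspecting only the audio that falls within `max_latency_ms` of each detected speech onset.

```python
import math


def frame_energy_db(frame):
    """Mean-square energy of one frame in dB (the floor avoids log(0))."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(power, 1e-12))


def online_decisions(samples, sample_rate=16000, frame_ms=10,
                     threshold_db=-40.0, max_latency_ms=300):
    """Return a list of (onset_ms, decision_ms, label), one per speech onset.

    Only frames within max_latency_ms of the onset are inspected, so each
    decision is available no later than onset_ms + max_latency_ms.
    All parameter defaults are illustrative assumptions.
    """
    frame_len = sample_rate * frame_ms // 1000
    budget = max_latency_ms // frame_ms  # frames available per decision

    # Energy-based VAD: a frame is "active" if its energy exceeds the threshold.
    active = [
        frame_energy_db(samples[i:i + frame_len]) > threshold_db
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

    results = []
    i = 0
    while i < len(active):
        if not active[i]:
            i += 1
            continue
        onset = i
        window = active[onset:onset + budget]
        # Placeholder rule (assumption, not the paper's classifier): speech
        # that stops inside the latency window is short enough to count as
        # a listener-response candidate.
        label = "other" if all(window) else "listener_response"
        decision_ms = (onset + len(window)) * frame_ms
        results.append((onset * frame_ms, decision_ms, label))
        # Skip the remainder of this speech segment before looking for
        # the next onset.
        while i < len(active) and active[i]:
            i += 1
    return results
```

Calling `online_decisions(samples, max_latency_ms=100)` or `max_latency_ms=500` on a list of floating-point samples would correspond to the two ends of the latency range mentioned in the abstract. A real system would replace the placeholder rule with a trained classifier that uses features from the same window.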

Links

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5947688

Lecture Information

Recorded:	2011-05-25 14:25 - 14:45, Club D
Added:	2011-06-20 00:17
Video resolution:	1024x576 px, 512x288 px
Video length:	0:11:42
Audio track:	MP3 [3.91 MB], 0:11:42
