ASRU 2013

Ouch - Outing Unfortunate Characteristics of HMMs (Used for Speech Recognition)

Jordan Cohen (Spelamode)
Steven Wegmann (ICSI)

Hidden Markov models (HMMs) have been applied to the problem of automatic speech recognition for more than 40 years, and today HMMs are used in nearly all commercial and research speech recognition systems - in spite of the fact that HMM-based speech recognition is maddeningly brittle. This presentation will describe Project OUCH (Outing Unfortunate Characteristics of HMMs), whose goal is a deep, quantitative understanding of the sources of HMM-based speech recognition errors and brittleness. As part of this project, we also interviewed 85 people in the speech and language industry. We will briefly summarize what you told us, and speculate on the impact of those findings.

Bio - Jordan Cohen

Dr. Jordan Cohen is an independent consultant at SPELAMODE, a small company specializing in Speech, Language, and Mobile Devices. He was previously a Senior Scientist at SRI International, where he served as the Principal Investigator for the DARPA GALE program for SRI, coordinating the activities of 14 subcontractors to harness speech recognition, language translation, and information annotation and distillation resources for the government. Prior to that, Jordan was the CTO of Voice Signal Technologies, a company which produced multimodal speech-centric interfaces for mobile devices. He has also worked at the Department of Defense, at IBM, and at the Institute for Defense Analyses in Princeton, NJ. Jordan received his PhD in Linguistics from the University of Connecticut, preceded by a master’s degree in Electrical Engineering from the University of Illinois. He is a member of the Acoustical Society of America and the IEEE, and has published in both the classified and unclassified literature of speech and language technologies.

Bio - Steven Wegmann

Steven Wegmann has worked at industrial research laboratories on problems in speech processing since 1994, holding positions at Dragon Systems, Lernout & Hauspie, VoiceSignal Technologies, Nuance Communications, and Cisco Systems. He has been a staff researcher at ICSI since 2010 and began leading the Speech Group in 2013. His current research interests are in the areas of automatic speech recognition, diagnostic analysis, and low-resource spoken term detection. Earlier in his career, he was a mathematician who specialized in algebraic topology. He obtained his doctorate in mathematics at the University of Warwick while he was a Marshall Scholar.