ASRU 2013

Multilayer perceptrons for speech recognition: There and Back Again

Nelson Morgan (ICSI)

Artificial neural networks have been applied to speech tasks for well over 50 years. In particular, multilayer perceptrons (MLPs) have been used as components in HMM-based systems for 25 years. This presentation will describe the long journey from early speech classification experiments with MLPs in the 1960s to present-day implementations. There will be an emphasis on hybrid HMM/MLP approaches, which have dominated the use of artificial neural networks for speech recognition since the late 1980s but have only recently gained mainstream adoption.


Nelson Morgan has been working on problems in signal processing and pattern recognition since 1974, with a primary emphasis on speech processing. He may have been the first to use neural networks for speech classification in a commercial application. He is a former Editor-in-chief of Speech Communication, and is a Fellow of the IEEE and of ISCA. In 1997 he received the Signal Processing Magazine best paper award (together with co-author Herve Bourlard) for an article that described the basic hybrid HMM/MLP approach. He also co-wrote a text with Ben Gold on speech and audio signal processing; its second edition (2011) was revised in collaboration with Dan Ellis of Columbia University. He is the deputy director (and former director) of the International Computer Science Institute (ICSI), and is a Professor-in-residence in the EECS Department at the University of California at Berkeley.