INTERSPEECH 2013

On the interaction of social and linguistic factors in phonetic variation in typical and atypical speakers

Benjamin Munson, Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, USA
The speech signal is remarkably rich. As discussed by Munson, Edwards, and Beckman (2012), a single production of the word cat can index not only the regular semantic features of Felis catus, but also the word’s position in the utterance’s larger prosodic structure, the speaker’s stance toward the topic being discussed, the speaker’s intentions for how the word should be interpreted relative to the ongoing discourse, and aspects of the speaker’s social identity (such as their gender and sexuality) and emotional state. Humans and automatic speech processing systems must be able to unpack these different messages from this complex signal. In this talk, I discuss how different types of information interact in speech production and perception. I give special attention to contrasting typical speakers and listeners with atypical populations, i.e., populations other than adult native speakers with no history of speech, language, or hearing impairments. Together, the results I present are a ‘call to action’ for the INTERSPEECH community to consider a broader set of sources of variability when modeling spoken language production and comprehension.