Interspeech 2021

Extending the Fullband E-Model Towards Background Noise, Bursty Packet Loss, and Conversational Degradations
(Oral presentation)

Thilo Michael (Technische Universität Berlin, Germany), Gabriel Mittag (Technische Universität Berlin, Germany), Andreas Bütow (Technische Universität Berlin, Germany), Sebastian Möller (Technische Universität Berlin, Germany)
Quality engineering of speech communication services in the full speech transmission band (0–20,000 Hz) is facilitated by the fullband E-model, a planning tool that predicts overall quality from parameters describing the service configuration. We presented a first version of this model at Interspeech 2019; it has since been standardized by the International Telecommunication Union as ITU-T Rec. G.107.2. That model was limited to predicting the effects of speech codecs, random packet loss, and transmission delay; more realistic conditions such as ambient background noise, bursty packet loss, and interactive conversational degradations could not be predicted. In the present paper, we present an approach to extend the E-model to better predict these effects, based on the results of two new listening-only and conversational tests. The results show that background noise effects at both the sending and receiving sides can be predicted well, whereas bursty packet loss predictions still have some limitations that stem from the available database. Finally, approaches from conversational analysis help to better predict the effects of delay on conversational quality.