InterSpeech 2021

Prosodic disambiguation using chironomic stylization of intonation for native and non-native speakers
(Oral presentation)

Xiao Xiao (LPP (UMR 7018), France), Nicolas Audibert (LPP (UMR 7018), France), Grégoire Locqueville (∂’Alembert (UMR 7190), France), Christophe d’Alessandro (∂’Alembert (UMR 7190), France), Barbara Kuhnert (LPP (UMR 7018), France), Claire Pillot-Loiseau (LPP (UMR 7018), France)
This paper introduces an interface that enables real-time gestural control of intonation in phrases produced by a vocal synthesizer. The melody and timing of a target phrase can be modified by tracing melodic contours on the touchscreen of a mobile tablet. Envisioning this interface as a means for non-native speakers to practice the intonation of a foreign language, we present a pilot study in which native and non-native speakers imitated the pronunciation of French phrases using their voice and using the interface, with and without a visual guide. Comparison of the resulting F0 curves against the reference contour and a preliminary perceptual assessment of the synthesized utterances suggest that, for both non-native and native speakers, imitation with the help of a visual guide is comparable in accuracy to vocal imitation, and that timing control was a source of difficulty.
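
The abstract does not say which distance measure was used to compare F0 curves against the reference contour. As an illustration only, one common way to make such a comparison is to convert both contours to semitones, resample them to a common number of points, and compute the root-mean-square error. The sketch below assumes fully voiced contours given as lists of F0 values in Hz; the function names, the 100 Hz reference, and the 50-point resampling are hypothetical choices, not details taken from the paper.

```python
import math


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert an F0 value in Hz to semitones relative to ref_hz (assumed reference)."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def resample(contour, n_points):
    """Linearly resample a contour to n_points samples (simple time normalization)."""
    if len(contour) < 2 or n_points < 2:
        raise ValueError("need at least two input samples and two output points")
    last = len(contour) - 1
    out = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)   # fractional index into the input
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(contour[lo] * (1.0 - frac) + contour[hi] * frac)
    return out


def contour_rmse_semitones(imitation_hz, reference_hz, n_points=50, ref_hz=100.0):
    """RMSE in semitones between two voiced F0 contours of possibly different lengths.

    Assumption: every value is a positive F0 in Hz (no unvoiced frames).
    """
    imit = resample([hz_to_semitones(f, ref_hz) for f in imitation_hz], n_points)
    ref = resample([hz_to_semitones(f, ref_hz) for f in reference_hz], n_points)
    squared = [(a - b) ** 2 for a, b in zip(imit, ref)]
    return math.sqrt(sum(squared) / n_points)


if __name__ == "__main__":
    reference = [120.0, 140.0, 180.0, 150.0, 110.0]   # made-up rising-falling contour
    imitation = [118.0, 150.0, 170.0, 115.0]          # made-up imitation, different length
    print(round(contour_rmse_semitones(imitation, reference), 2))
```

Because both contours are stretched to the same length before comparison, this measure captures melodic shape but ignores timing differences. A study interested in timing control, as this one is, would need a separate measure for it.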