Joanna Rownicka (Aflorithmic Labs, UK), Kilian Sprenkamp (Aflorithmic Labs, UK), Antonio Tripiana (Aflorithmic Labs, UK), Volodymyr Gromoglasov (Aflorithmic Labs, UK), Timo P. Kunz (Aflorithmic Labs, UK)
We describe our approach to creating and delivering a custom voice for a conversational AI use case. Specifically, we provide a voice for a Digital Einstein character to enable human-computer interaction within the digital conversation experience. To create a voice that fits the context well, we first design a voice character and produce recordings that match the desired speech attributes. We then model the voice. Our solution uses FastSpeech 2 to predict log-scaled mel-spectrograms from phonemes and Parallel WaveGAN to generate the waveforms. The system takes characters as input and produces a speech waveform as output. We use a custom dictionary for selected words to ensure their correct pronunciation. Our proposed cloud architecture enables fast voice delivery, making it possible to talk to the digital version of Albert Einstein in real time.
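The pipeline described above — phonemization with a custom pronunciation dictionary, mel-spectrogram prediction (FastSpeech 2), and waveform generation (Parallel WaveGAN) — can be sketched as follows. This is a minimal illustration of the data flow only: the model calls are placeholders, and the dictionary entry, shapes, and parameter values are assumptions, not details from the paper.

```python
import numpy as np

# Custom pronunciation dictionary for selected words.
# The entry below is illustrative, not taken from the paper's actual lexicon.
CUSTOM_DICT = {
    "einstein": ["AY1", "N", "S", "T", "AY2", "N"],
}

def to_phonemes(text, lexicon):
    """Convert text to a phoneme sequence, preferring custom dictionary entries.
    Words not in the lexicon fall back to a naive letter-by-letter spell-out
    (a real system would use a trained grapheme-to-phoneme model here)."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(lexicon.get(word, list(word.upper())))
    return phonemes

def predict_mel(phonemes, n_mels=80, frames_per_phoneme=5):
    """Placeholder for FastSpeech 2: maps a phoneme sequence to a log-scaled
    mel-spectrogram. Here we just return a dummy array of a plausible shape."""
    n_frames = len(phonemes) * frames_per_phoneme
    return np.zeros((n_mels, n_frames))

def generate_waveform(mel, hop_length=256):
    """Placeholder for the Parallel WaveGAN vocoder: upsamples the
    mel-spectrogram to audio, hop_length samples per spectrogram frame."""
    return np.zeros(mel.shape[1] * hop_length)

def synthesize(text):
    """Character input -> phonemes -> mel-spectrogram -> speech waveform."""
    phonemes = to_phonemes(text, CUSTOM_DICT)
    mel = predict_mel(phonemes)
    return generate_waveform(mel)

audio = synthesize("hello einstein")
```

In a real deployment the two placeholder functions would wrap trained FastSpeech 2 and Parallel WaveGAN models; the sketch only shows how the custom dictionary and the two-stage acoustic model/vocoder split fit together.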