We set out to capture the imagination of users of all ages with an artificial intelligence feature built for WhatsApp, and we delivered on that promise: people all over France could easily get a personalized video message from Santa.
The WhatsApp bot was developed by combining deep learning, computer vision and text-to-speech technologies. Here is how it was built, step by step:
The messages sent to Santa are processed by text-to-speech technology. First, the text is converted into an MP3 audio file featuring a synthetic voice specially designed for the operation.
We recorded a real-life actor reading a long list of key words that make it possible to recreate as many sounds of the French language as possible. A specific algorithm developed by Voxygène was then used to synthesize the voice.
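To make the idea of a word list that covers as many sounds as possible more concrete, here is a minimal sketch of one way such a list could be chosen: a greedy pass that keeps picking the word that adds the most phonemes not yet covered. This is an illustration only, not the method used by us or by Voxygène. The `pick_words` function, the `LEXICON` table and its rough, hand-written phoneme sets are all hypothetical.

```python
# Hypothetical toy lexicon: word -> rough set of phonemes it contains.
# The transcriptions are hand-written approximations for illustration only.
LEXICON = {
    "noël": {"n", "ɔ", "ɛ", "l"},
    "cadeau": {"k", "a", "d", "o"},
    "renne": {"ʁ", "ɛ", "n"},
    "traîneau": {"t", "ʁ", "ɛ", "n", "o"},
    "sapin": {"s", "a", "p", "ɛ̃"},
}


def pick_words(lexicon: dict[str, set[str]]) -> list[str]:
    """Greedily pick words until every phoneme in the lexicon is covered."""
    remaining = set().union(*lexicon.values())
    chosen: list[str] = []
    while remaining:
        # Take the word that adds the most missing phonemes;
        # iterating over sorted words breaks ties alphabetically.
        best = max(sorted(lexicon), key=lambda w: len(lexicon[w] & remaining))
        chosen.append(best)
        remaining -= lexicon[best]
    return chosen


if __name__ == "__main__":
    print(pick_words(LEXICON))
```

With this toy lexicon, "renne" is never selected, because every sound it contains is already covered by "traîneau". A real recording script would work from a full pronunciation dictionary and would also care about sound transitions, not just individual phonemes.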
Matching the movements of Santa’s mouth to the audio required the development of three models: the first detects a face in a video, the second generates a set of reference points based on a sound sequence, and the third recreates a new, personalized face animation.
- Model 1: We created a first algorithm and trained it to detect human faces in video.
- Model 2: We built a second algorithm able to analyze the phonemes in the audio recordings of these videos and to generate moving reference points accordingly. The objective of that process was to create a dataset on which to train a third model.
- Model 3: A third algorithm was developed and trained to use the reference points thus created to animate another face with very realistic movements.
These models were then applied to the Santa videos produced in-house, and were used to personalize the movements of Santa’s mouth to fit the content of user messages.
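The way the three models hand data to one another can be sketched as follows. This is a simplified illustration of the data flow only: each function is a hypothetical stand-in for a trained model, and the names (`FaceBox`, `detect_face`, `phonemes_to_openness`, `animate_mouth`) and the `MOUTH_OPENNESS` table are assumptions made for the example, not the production code.

```python
from dataclasses import dataclass


@dataclass
class FaceBox:
    """Rectangle around a detected face, in pixels."""
    x: int
    y: int
    width: int
    height: int


# Stand-in for what model 2 learns: how open the mouth is (0 to 1) per phoneme.
MOUTH_OPENNESS = {"a": 1.0, "o": 0.7, "e": 0.5, "m": 0.0, "p": 0.0}


def detect_face(frame_width: int, frame_height: int) -> FaceBox:
    """Model 1 stand-in: assume the face fills the centre of the frame."""
    return FaceBox(frame_width // 4, frame_height // 4,
                   frame_width // 2, frame_height // 2)


def phonemes_to_openness(phonemes: list[str]) -> list[float]:
    """Model 2 stand-in: one mouth-openness value per phoneme.

    A real model would output many moving reference points per frame;
    a single number keeps the sketch readable.
    """
    return [MOUTH_OPENNESS.get(p, 0.3) for p in phonemes]


def animate_mouth(face: FaceBox, openness: list[float]) -> list[dict]:
    """Model 3 stand-in: turn reference values into per-frame mouth gaps."""
    max_gap = face.height // 5  # assume the mouth opens up to 1/5 of face height
    return [{"frame": i, "mouth_gap_px": round(o * max_gap)}
            for i, o in enumerate(openness)]


if __name__ == "__main__":
    face = detect_face(640, 480)
    for frame in animate_mouth(face, phonemes_to_openness(["m", "a", "p", "a"])):
        print(frame)
```

The point of the split is that the third step only ever sees reference points, never raw audio, which is why the second model's output can double as training data for the third.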