Alexa, have a heart! British company develops artificial voice that can speak with ‘deep human emotion’ — and even cry
- The software from Sonantic can imbue dialogue with emotional inflections
- It works by sampling lines recorded by voice actors with different deliveries
- Game and film companies can then use the AI to produce speech in minutes
- The actors are given royalties every time their synthesised voices are used
A British company has developed an artificial voice that can speak with ‘deep human emotion’ — and even cry — with complete realism.
The digital helpers that we are used to, like Alexa and Google Assistant, tend to speak in near-monotones, without real inflection to convey emotion.
While this may suffice for voice assistants, such flat computer-generated voices are unsuitable for applications like producing dialogue for video games or film.
However, technology developed by the ten-person team at the London-based firm Sonantic allows the creation of authentic-sounding lines of speech in minutes.
The software can imbue its voices with various characteristics — from panic to sadness and even breathlessness.
‘We create hyper-realistic artificial voices. Unlike other text-to-speech companies, we specialise in subtleties and nuance, giving voice acting on demand, essentially,’ Sonantic chief executive Zeena Qureshi told The Times.
To create each distinct voice, the firm works with actors to record assorted words and sentences spoken with different inflections. From these samples, the AI tool can construct any requested line and deliver it with one of various emotions.
When a game or film production company uses one of Sonantic’s synthetic voices, the actor who helped create it receives royalties for their contribution.
‘Voice pipelines and entertainment work are quite heavy on logistics, such as casting, editing, directing, booking studios and doing several iterations and there is quite a lot of cost going into that,’ Ms Qureshi told The Times.
‘We can take the process from months down to minutes and spare the hassle of all the logistics involved — and it’s cheaper,’ she added.
One key advantage of the system over using conventional voice actors comes when last-minute dialogue changes are required.
While bringing an actor back in to re-record lines can be time-consuming — especially if conflicting schedules are involved — the AI can deliver new speech in the desired voice within mere minutes.
‘We are using deep learning to really focus on those micro elements of, say, what constitutes sadness,’ Sonantic chief technology officer John Flynn told The Times.
‘So we have the algorithms focus on the intakes of breath and different sort of noises that would happen when someone is crying and the pitches of tone.’
According to The Times, the closure of traditional recording studios during lockdown has led various television and film studios to approach Sonantic in search of alternative ways to secure the voice work they require.