Sound clips of Donald Trump reading the ‘Three Little Pigs’ nursery rhyme aloud and Tom Hanks reciting Pulp Fiction’s ‘Ezekiel 25:17’ may sound realistic, but they were generated by artificial intelligence.
The clips were made using Tortoise TTS (Text-to-Speech), which is capable of replicating a person’s voice after analyzing 20 seconds of an audio clip with them speaking.
DailyMail.com asked the AI to clone the voices of the former president and the actor.
Shashank Jain, a developer who used Tortoise TTS to create the voice-generating tool, said his main idea was to create a tool that allows us to generate podcasts based on text.
‘With the arrival of ChatGPT, we can generate conversations in the format we want, provide the feed to the tool I created and out comes a podcast between two speakers of our choice,’ he told DailyMail.com.
And just as Microsoft is not releasing its voice-cloning VALL-E due to fears of misuse, Jain also plans to keep Tortoise safeguarded from bad actors.
Using AI to write essays, create music and replicate someone’s voice was once seen as something from a science-fiction film, but is now becoming the way of the world.
Jain shared his technology on Twitter after Microsoft announced VALL-E, tweeting that the technology already exists.
He said a topic is first fed to ChatGPT, OpenAI’s popular chatbot, to generate a textual conversation between two chosen speakers.
‘Once that is done, the text is fed to my tool, which then creates the podcast based on audio samples of two characters (Musk and Hanks in this case) and text conversation between the two,’ said Jain.
‘My main reason was just to do this as a hobby and not do anything commercial with it.
‘Microsoft VALL-E promises to do the same and architecture wise also uses Transformers architecture underlying.
‘Microsoft has not made its model public yet mainly due to concerns of misuse of voices.’
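The workflow Jain describes, a generated script handed to a voice tool along with one audio sample per speaker, can be sketched roughly as follows. This is an illustration only, not Jain's code and not the Tortoise TTS API: the names `parse_script`, `build_podcast` and `synthesize` are hypothetical, and the 'Speaker: line' script format is an assumption about what the chatbot is asked to produce.

```python
from typing import Callable, Dict, List, Tuple


def parse_script(script: str) -> List[Tuple[str, str]]:
    """Split 'Speaker: line' text into ordered (speaker, line) turns."""
    turns = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and lines without a speaker label
        speaker, text = line.split(":", 1)  # split on the first colon only
        turns.append((speaker.strip(), text.strip()))
    return turns


def build_podcast(
    script: str,
    voice_samples: Dict[str, str],
    synthesize: Callable[[str, str], object],
) -> List[object]:
    """Render each turn in its speaker's voice, preserving turn order.

    `synthesize(text, sample)` is a hypothetical stand-in for whatever
    voice-cloning model is used; it is assumed to return one audio clip.
    """
    clips = []
    for speaker, text in parse_script(script):
        if speaker not in voice_samples:
            raise KeyError(f"no voice sample for speaker {speaker!r}")
        clips.append(synthesize(text, voice_samples[speaker]))
    return clips
```

In this sketch the voice model is passed in as a function, so the script handling can be checked with a dummy synthesizer before any real audio is generated.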