OpenAI, the creator of ChatGPT, has unveiled a new voice cloning tool called “Voice Engine.” Despite the anticipation, the company is holding off on a public release due to concerns about malicious use and a potential rise in fake content online.
OpenAI’s Voice Engine generates realistic speech that sounds just like the original speaker. It does so by combining text input with a mere 15-second audio sample. Currently undergoing testing, Voice Engine comes with strict rules outlined by OpenAI in a blog post. These rules require partners seeking to use the technology to obtain explicit and informed consent from any individual whose voice they intend to replicate.
According to the company, AI-generated voices must also be clearly labeled as such for the audience that hears them.
OpenAI debuted the technology on Friday, just over a week after filing a trademark application for the name. The company plans to preview the voice tool with early testers while restricting broader access, particularly during this election year.
Authorities in New Hampshire are looking into robocalls made to thousands of voters before the presidential primary. The calls featured an artificial intelligence (AI) voice that mimicked President Joe Biden. Several startups already sell voice-cloning technology: some offer it to the general public, whereas others make it available only to a select group of business clients.
How can OpenAI’s Voice Engine be used?
Content creation in multiple languages: It maintains the speaker’s voice while translating audio into many languages (e.g., educational content, podcasts).
Helpful technology for non-verbal people: It gives users of AAC devices or those with speech difficulties voices that sound natural.
Voiceovers and narration: It creates voices for audiobooks, documentaries, or other media projects.
Personalization: It gives people the option to select a voice for virtual assistants or audiobooks that most closely resembles their own.
Regaining lost voices: It helps those who, as a result of disease or trauma, have lost their voice.
If it is so useful, why hold back?
OpenAI is piloting a program that gives developers access to its Voice Engine API. The API will allow users to input text and receive AI-generated speech as output. However, due to ethical concerns, OpenAI has scaled back its plans for a wider release.
Instead of a full release, OpenAI plans to preview the Voice Engine technology. The company believes this limited access will still showcase the technology’s potential while highlighting the need for stronger defenses against misuse of generative models like Voice Engine. Safety concerns surrounding AI are the primary reason for this cautious approach.
Beyond these benefits, realistic voice clones also open the door to malicious use by criminals. They can be used to spread misinformation and carry out fraud. AI-generated voices can also circumvent voice-based authentication systems, undermining the use of voice as a password. OpenAI says it is prioritizing safety to limit the damage that audio deepfakes can inflict on a person’s reputation and well-being. To that end, the company is developing best practices and security measures before releasing Voice Engine to the public, with the aim of ensuring the technology is used responsibly.
OpenAI – Sora
OpenAI has developed a generative AI model called Sora. This model specializes in text-to-video generation, meaning it takes your descriptions and turns them into short video clips. You can describe artistic styles, fantastical creatures, or even real-world events, and Sora will bring them to life. However, for real-world scenarios, you might need to provide additional details to ensure accuracy.
Sora creates visually stunning videos, complete with complex camera movements and characters that express a wide range of emotions. It can even extend existing short videos by seamlessly adding new content that matches the original clip. While it is not yet available to the public (as of March 2024), Sora holds a lot of promise for the future of video creation.