OpenAI introduces hyper-realistic ChatGPT voice for select paying users

OpenAI now offers ChatGPT’s hyper-realistic voice to select paying users. Experience more lifelike interactions with ChatGPT’s latest vocal capabilities.

Jul 31, 2024 - 10:38

OpenAI has begun rolling out ChatGPT’s Advanced Voice Mode, offering users their first experience with GPT-4o’s hyper-realistic audio responses. The alpha version is available today to a select group of ChatGPT Plus users, with a broader rollout expected for all Plus users in fall 2024.

When OpenAI first revealed GPT-4o’s voice in May, it astonished audiences with its lifelike and rapid responses, notably resembling the voice of Scarlett Johansson, who voiced the AI assistant in the film "Her." Following the demo, Johansson declined multiple requests from CEO Sam Altman to use her voice and subsequently engaged legal representation to protect her likeness. OpenAI denied using Johansson’s voice but removed it from the demo. In June, the release of Advanced Voice Mode was postponed to enhance safety measures.

Now, the wait is over, though not completely. The video and screen-sharing features demonstrated in the Spring Update are not included in this alpha release and will launch at a later date. For now, the full experience shown in the groundbreaking GPT-4o demo remains just a demo, though some premium users are gaining access to the voice feature itself.

ChatGPT can now talk and listen: Introducing Advanced Voice Mode

If you’ve tried the existing Voice Mode in ChatGPT, you might notice that OpenAI’s new Advanced Voice Mode is a significant upgrade. Previously, ChatGPT used three separate models for audio tasks: one for voice-to-text conversion, GPT-4 for processing prompts, and another for text-to-voice conversion. In contrast, GPT-4o integrates these functions into a single, multimodal model, which reduces latency and improves the conversational experience. Additionally, GPT-4o can detect emotional nuances in your voice, such as sadness, excitement, or even singing.
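To make the architectural difference concrete, here is a minimal Python sketch of the two designs described above. Every function name is an illustrative placeholder, not a real OpenAI API; the sketch only shows how the old three-hop pipeline differs in data flow from a single speech-to-speech model.

```python
# A minimal sketch, under assumed placeholder functions, of the pipeline
# difference described above. None of these names are real OpenAI APIs.

def transcribe(audio: bytes) -> str:
    """Old pipeline, stage 1: voice-to-text conversion (placeholder)."""
    return "what's the weather like?"

def generate_reply(prompt: str) -> str:
    """Old pipeline, stage 2: GPT-4 processes the text prompt (placeholder)."""
    return f"Reply to: {prompt}"

def synthesize(text: str) -> bytes:
    """Old pipeline, stage 3: text-to-voice conversion (placeholder)."""
    return text.encode("utf-8")

def legacy_voice_mode(audio: bytes) -> bytes:
    # Three sequential model hops: each adds latency, and paralinguistic
    # cues (tone, emotion, singing) are discarded at the transcription step.
    return synthesize(generate_reply(transcribe(audio)))

def multimodal_voice_mode(audio: bytes) -> bytes:
    """GPT-4o-style design (placeholder): one model maps audio directly
    to audio, so there is a single hop and emotional nuance can survive
    end to end."""
    return b"spoken reply with preserved tone"

if __name__ == "__main__":
    user_audio = b"raw microphone bytes"
    print(legacy_voice_mode(user_audio))      # three sequential calls
    print(multimodal_voice_mode(user_audio))  # one call end to end
```

The latency point falls directly out of the structure: the cascaded design cannot start synthesizing speech until transcription and text generation have both finished, while a single multimodal model removes those intermediate handoffs entirely.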

In this pilot phase, ChatGPT Plus users will be among the first to experience the hyper-realistic capabilities of OpenAI’s Advanced Voice Mode. While TechCrunch has not yet reviewed the feature, a detailed review will follow once access is granted.

Gradual rollout: OpenAI is releasing the new voice feature in stages so it can monitor usage and performance. Members of the alpha group will receive an alert in the ChatGPT app and an email with instructions on how to use the feature.

Rigorous testing: Since the initial demo, OpenAI has tested GPT-4o’s voice capabilities with more than 100 external red teamers across 45 languages. A report detailing these safety measures and testing results is expected in early August.