OpenAI introduces natural voice conversations for ChatGPT

ChatGPT's Advanced Voice Mode offers natural conversations with AI, emotion detection, personalized voices, faster responses, and five new voice options for paid subscribers.

Sep 25, 2024 - 15:48

OpenAI has extended access to its Advanced Voice Mode (AVM) for ChatGPT, allowing more paid subscribers to use the feature for more natural interactions with the AI. Plus and Team plan subscribers can use AVM now, with Enterprise and education users set to gain access starting next week. Once it is available, users will receive a pop-up notification in the ChatGPT app. However, there is no fixed timeline for a full rollout across all regions, and AVM is not yet accessible in several areas, including the EU, the UK, Iceland, Switzerland, Liechtenstein, and Norway. OpenAI also has no current plans to make AVM available to free-tier users, keeping the feature exclusive to paying customers for now.

AVM introduces a range of features designed to make conversations with the AI more natural and interactive. One standout is the ability to interrupt ChatGPT mid-response, which makes conversations feel more fluid. AVM can also recognize emotions through vocal tone and adjust its replies to better match the mood or context. The mode delivers faster response times and offers personalized voice options, along with improved pronunciation of non-English words.

As part of the update, AVM now includes five new voices (Arbor, Sol, Maple, Vale, and Spruce), bringing the total to nine alongside the previously available Juniper, Breeze, Ember, and Cove. OpenAI has given these voices nature-themed names to emphasize the feature's goal of making AI interactions feel more organic. One notable absence from the lineup is the Sky voice, which was introduced in an earlier update. OpenAI paused Sky after actress Scarlett Johansson raised concerns about its similarity to her own voice and threatened legal action. OpenAI has said that Sky was voiced by a different actress.

AVM’s capabilities were first showcased in May during the unveiling of GPT-4o, though the official release came later, in July, to a limited group of invite-only users. As part of the wider rollout, AVM has undergone a design update, replacing the black dots shown in earlier demos with a sleek blue animated sphere. This update aims to create a more visually appealing and modern experience for users.

OpenAI has stated that AVM has undergone safety testing by external experts since its July release. However, because AVM is a closed-source model, independent researchers cannot easily examine its safety and biases.

AVM’s main competitor is Google’s Gemini Live, launched in mid-August for Advanced subscribers on Android, with plans to expand to iOS and additional languages soon. Gemini Live offers 10 voice options, integrates with Google apps for seamless task management, and supports hands-free conversations, further intensifying the competition in the AI voice technology space.