This AI voice assistant outpaced OpenAI in delivering one of ChatGPT's highly anticipated features

Discover how a new AI voice assistant surpassed OpenAI in delivering one of ChatGPT's most anticipated features.

Jul 25, 2024 - 12:23

OpenAI's postponement of ChatGPT's anticipated Voice Mode disappointed many of its fans, and now the company may have been beaten to the feature. Kyutai, a French AI developer, has introduced Moshi, a real-time voice AI assistant. Designed to hold natural conversations in the manner of Alexa or Google Assistant, Moshi is powered by the Helium 7B language model. According to Kyutai, Moshi can speak in various accents and has 70 distinct emotional and speaking styles. It can also handle two audio streams at once, which lets it listen and speak at the same time.

Kyutai refined Moshi by fine-tuning it on more than 100,000 synthetic dialogues generated with text-to-speech (TTS) technology, aiming to capture the subtleties and nuances of human communication. The company also worked with a professional voice artist to improve the quality of Moshi's voice.

Moshi was trained on a combination of text and audio data and is optimized to run on a range of backends, including locally on devices such as laptops, without relying on the cloud. Kyutai promotes this approach as a way to protect privacy and security, since sensitive data does not have to be sent over the internet.

Open talk: Kyutai's Moshi promises open-source innovation

Kyutai announced that Moshi will be an open-source project and that it will share the model's code and framework to foster innovation. The approach is meant to address the safety and ethics concerns associated with closed AI models from larger companies. Kyutai, whose backers include French billionaire Xavier Niel, is also developing AI audio identification, watermarking, and signature tracking systems for Moshi. These features are intended to make AI-generated content more accountable and traceable. As Moshi continues to evolve, its voice capabilities could push competitors to speed up the release of their own voice-enabled assistants, such as ChatGPT, or to integrate large language models into existing platforms like Alexa.