New AI Challenger: French Lab Unveils Voice Assistant With 70 Emotions And Conversation Styles

A new AI assistant is set to challenge ChatGPT, having outpaced OpenAI in launching an AI voice assistant. Moshi, developed by Kyutai, an independent non-profit AI research lab in France, boasts a multimodal model that integrates seeing, hearing, and speaking capabilities with 70 distinct emotions and conversation styles.

Kyutai recently demonstrated Moshi in Paris, showcasing it as the world’s first public real-time generative voice AI. The team spent six months developing Moshi, which can offer advice on topics like climbing Mount Everest and recite poems with a French accent. Kyutai plans to release the model and its research in the coming weeks.

Moshi is positioned against OpenAI’s GPT-4o, another voice AI model capable of real-time inference and responses. However, GPT-4o’s full voice capabilities won’t be available until the fall. “We believe that Moshi has great potential to change the way we communicate with machines and through machines,” said Patrick Perez, CEO of Kyutai.

Despite expert warnings about AI dangers, numerous startups and tech giants like Anthropic, Cohere, and Google are racing to compete with OpenAI’s GPT-4. In May, OpenAI introduced GPT-4o, a model with a voice mode, image recognition, and fast responses. The voice mode was originally slated for release within a few weeks but was delayed until the fall due to feature adjustments.

OpenAI faced backlash for using a voice resembling that of actress Scarlett Johansson in an AI demo. The company withdrew the voice following legal action from the actress.

Kyutai’s Perez announced plans to release Moshi’s models and research as open source, with the code freely available. He described Moshi as “the first public real-time voice AI assistant.” A statement from Kyutai on Wednesday reiterated the service’s experimental prototype status and promised the release of models and research soon.

Founded in November 2023 with €300 million in funding from notable figures like Xavier Niel, Rodolphe Saade, and former Google CEO Eric Schmidt, Kyutai has recruited researchers from Google’s DeepMind and Meta. Chief Science Officer Herve Jegou addressed security concerns, stating that the lab will use indexing and watermarking tools to track AI-generated audio.

With an open-source release planned and GPT-4o’s voice features delayed, Moshi is positioned as a significant player in the AI voice assistant market.
