Today, we introduce Gemini 3.1 Flash Live, our highest-quality audio and voice model to date. It offers the speed and natural rhythm required for the next generation of voice-first AI, providing a more intuitive experience for developers, enterprises, and everyday users.
Key Features
- Enhanced Real-Time Dialogue Capabilities: Gemini 3.1 Flash Live significantly improves response time and natural conversational flow. It is available to developers via the Gemini Live API in Google AI Studio and to enterprises in Gemini Enterprise.
- Reliable Execution of Complex Tasks: 3.1 Flash Live leads the ComplexFuncBench Audio benchmark with a score of 90.8% and also performs strongly on Scale AI's Audio MultiChallenge, scoring 36.1%.
- Improved Tonal Understanding: The model better recognizes acoustic nuances such as pitch and pace, dynamically adjusting responses to user expressions of frustration or confusion.
- Multilingual Support: Gemini Live and Search Live now support real-time multimodal conversations in over 200 countries and territories, allowing users to communicate in their preferred language.
- Watermark Technology: All audio generated by 3.1 Flash Live includes an imperceptible SynthID watermark, which helps detect AI-generated content and limit the spread of misinformation.
The launch of Gemini 3.1 Flash Live marks a new advancement in the naturalness and reliability of audio AI, and we look forward to seeing how developers and users will interact and create with it.
Blogger's Review: The release of Gemini 3.1 Flash Live improves the naturalness and responsiveness of audio AI and gives developers robust tools for building complex voice agents. It also advances multilingual support and information security, making it a noteworthy development in the industry.