[DeepMind] Gemini Omni: A New Era in Video Creation

Gemini Omni Flash is a new model that can transform any input into video. As part of the Gemini series, Omni lets users combine images, audio, video, and text to generate high-quality videos grounded in Gemini's real-world knowledge. Users can edit their videos through natural language, with each instruction building on the previous result.

Here are some unique features of Omni:

Edit videos through conversation: Users can modify videos using natural language. Characters stay consistent, physical laws are respected, and each scene retains what came before.

Example: When the person touches the mirror, make the mirror ripple beautifully like liquid.

Transform the environment: Users can change specific elements or the entire scene, creating footage they could never film themselves.

Example: Put a black and white checkerboard room inside a glass sphere that floats, tracking above the hand, containing a recursive representation of the same hand holding the sphere.

Create realistic scenes based on Gemini knowledge: Omni combines an intuitive understanding of physics with Gemini's knowledge of history, science, and cultural context, enhancing storytelling capabilities.

Example: Create a claymation explainer of protein folding.

Create videos with your own digital avatar: Users can generate personalized videos using their own voice and digital avatar.

Gemini Omni Flash is now available to all Google AI Plus, Pro, and Ultra subscribers, and will soon be accessible to developers and enterprise customers via APIs. All generated videos include an imperceptible SynthID digital watermark, ensuring content transparency and verifiability.

Try Gemini Omni and embark on a new creative journey!

Blogger's Review: The launch of Gemini Omni marks a significant shift in the video creation landscape. By combining natural language processing with deep learning, Omni lets users not only generate videos seamlessly but also edit and adjust content in real time. This promises far greater flexibility for content creators, and its future functionality and applications are highly anticipated.