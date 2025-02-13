Gemini Live gets smarter: screen sharing, video streaming, and improved language understanding
Google is making its Gemini Live AI assistant even smarter and more engaging, with new features like screen sharing and live video streaming, along with significant improvements to its language understanding capabilities.
In an email to users, Google revealed that Gemini Live has been upgraded with its latest AI model, improving its ability to understand different languages, accents and dialects. The update also includes improvements to translation, as well as the ability to use screen sharing and live video streaming.
Gemini Live update email.
As we all know, the better these AI assistants get, the more personal information they need from us, so it is no surprise that Google wants to store users' audio, video, and screen-share data in their Gemini Apps Activity. Thankfully, you have the option to turn this off. For now, only conversation transcripts are saved, and only if you have enabled Gemini Apps Activity.
As for how these new features came to be, they are likely powered by the Multimodal Live API, which was released alongside Gemini 2.0 late last year. This API lets developers send text, audio, and video inputs and receive text or audio responses.
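To make that input/output split concrete, here is a minimal Python sketch of the idea. It is not the Multimodal Live API itself and makes no network calls: the class `LiveSessionSketch`, its methods `send` and `end_turn`, and the `InputChunk` type are hypothetical names invented for illustration. The only thing taken from the description above is which modalities go in (text, audio, video) and which come out (text or audio).

```python
from dataclasses import dataclass, field
from typing import List

# Modalities as described in the article; everything else here is hypothetical.
INPUT_MODALITIES = {"text", "audio", "video"}
OUTPUT_MODALITIES = {"text", "audio"}


@dataclass
class InputChunk:
    """One piece of user input, tagged with its modality."""
    modality: str
    payload: bytes


@dataclass
class LiveSessionSketch:
    """Toy stand-in for a live multimodal session (not a real client)."""
    response_modality: str = "text"
    pending: List[InputChunk] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Responses are limited to text or audio.
        if self.response_modality not in OUTPUT_MODALITIES:
            raise ValueError(f"unsupported response modality: {self.response_modality}")

    def send(self, modality: str, payload: bytes) -> None:
        # Inputs may be text, audio, or video.
        if modality not in INPUT_MODALITIES:
            raise ValueError(f"unsupported input modality: {modality}")
        self.pending.append(InputChunk(modality, payload))

    def end_turn(self) -> dict:
        # Summarize what a model would receive for this turn, then reset.
        summary = {
            "response_modality": self.response_modality,
            "inputs": [chunk.modality for chunk in self.pending],
        }
        self.pending.clear()
        return summary
```

In this sketch, a screen-sharing turn would amount to calling `send("video", frame_bytes)` alongside `send("audio", mic_bytes)` and then `end_turn()`. How Gemini Live actually wires this up has not been detailed by Google.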
Going deeper into the matrix
None of this is especially surprising, as it fits the industry-wide trend towards multimodal AI, where systems can process and respond to different types of input, making them more versatile and user-friendly.
Google called Gemini 2.0 the start of the "agent era," where AI can do more on its own. This model is on par with OpenAI's o1, but with the added ability to natively generate images, speech, text, and more.
The first in the lineup is Gemini 2.0 Flash, which is twice as fast as its predecessor, Gemini 1.5 Pro. Gemini 2.0 marks a significant step forward in AI capabilities, moving from the "chatbot era" of simple conversations and content generation to an era of reasoning and independent action.
The "agent era" signifies a shift towards AI that can not only understand and respond to requests but also anticipate needs and proactively complete tasks, making it a more integrated and indispensable part of our digital lives.
These updates to Gemini Live are currently rolling out to users, promising a more intuitive and dynamic AI experience, provided you are willing to enable them.
This evolution of AI has the potential to revolutionize various sectors, from customer service and education to healthcare and personal productivity. But it is also a worrisome kind of progress, because it comes at the cost of privacy.
