Google's new Gemini AI model can now listen directly to audio files

Okay, so for the past year and a half, we've witnessed the rapid growth of generative AI (is it taking over the world?), and AI models are gaining more and more knowledge as we speak. Now, Android Headlines reports that Google's new AI model, Gemini 1.5 Pro, can listen to audio.

Gemini can now listen to and understand audio files


As you may know, the more data you feed an AI model, the better it becomes (and the freakier, if you're one of the more skeptical people). At first, AI models were trained mostly on text, which is especially important for chatbots. Later, they learned to process image data, and they can now be used to reconstruct an image (or create a whole new one from your prompt).

Gemini (which used to be called Bard, for those of you who don't know) has been able to process images, and now it's expanding to audio. The version that does that, Gemini 1.5 Pro, is currently in testing. This opens up a world of possibilities, like summaries of long keynotes, conversations, earnings calls, lectures, and similar recordings. You'll be able to upload the audio file to Gemini.

Tools that summarize long calls already exist, but they transcribe the call first and then summarize the transcript. Gemini, by contrast, will listen to the call directly.

Don't get excited too quickly, though - for now, this won't be available as a public release. To use it, you'll need to go through Google's development platform Vertex AI or through AI Studio. It's bound to make it to the public as well, but we don't know when.

All in all, witnessing the growth of AI is seriously exciting. If you're one of the people who fear it will rule the world one day, don't be too scared. The way I see it, it's here to make our lives easier and give us more space to fulfill our potential as intelligent, intuitive, and creative human beings. It will just ensure we won't have to waste precious time on the boring stuff (like listening to a long earnings call, you know).
