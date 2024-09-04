

TalkBack can read images even if your phone is offline – thanks to the on-device Gemini Nano

A screenshot from the Android Developers Blog for the TalkBack functionality.
TalkBack, the indispensable Android feature for people who are blind or have low vision, gets a lot more useful – and powerful – thanks to Gemini Nano with multimodality.

There's an extensive blog piece on the Android Developers Blog, where the team opens up about the latest enhancement of the screen reader feature from the Android Accessibility Suite.

Today, thanks to Gemini Nano with multimodality, TalkBack automatically provides users with blindness or low vision more vivid and detailed image descriptions to better understand the images on their screen.

– Android Developers Blog, September 2024

TalkBack includes a feature that provides image descriptions when developers haven’t added descriptive alt text. Previously, this feature relied on a small machine learning model called Garcon, which generated brief and generic responses, often lacking specific details like landmarks or products.

The introduction of Gemini Nano with multimodal capabilities presented an ideal opportunity to enhance TalkBack’s accessibility features. Now, when users opt in on eligible devices, TalkBack leverages Gemini Nano’s advanced multimodal technology to automatically deliver clear and detailed image descriptions in apps like Google Photos and Chrome, even when the device is offline or experiencing an unstable network connection.

Google's team provides an example that illustrates how Gemini Nano improves image descriptions. First, Garcon is presented with a panorama of the Sydney, Australia shoreline at night – and it might read: "Full moon over the ocean". Gemini Nano with multimodality, however, can paint a richer picture, with a description like: "A panoramic view of Sydney Opera House and the Sydney Harbour Bridge from the north shore of Sydney, New South Wales, Australia". Sounds far better, right?

Utilizing an on-device model like Gemini Nano was the only practical solution for TalkBack to automatically generate detailed image descriptions, even when the device is offline.

The average TalkBack user comes across 90 unlabeled images per day, and those images weren't as accessible before this new feature. The feature has gained positive user feedback, with early testers writing that the new image descriptions are a “game changer” and that it’s “wonderful” to have detailed image descriptions built into TalkBack.
– Lisie Lillianfeld, product manager at Google

When implementing Gemini Nano with multimodality, the Android accessibility team had to choose between inference verbosity and speed, a decision partly influenced by image resolution. Gemini Nano currently supports images at either 512 pixels or 768 pixels.

While the 512-pixel resolution generates the first token almost two seconds faster than the 768-pixel option, the resulting descriptions are less detailed. The team ultimately prioritized providing longer, more detailed descriptions, even at the cost of increased latency. To reduce the impact of this delay on the user experience, the tokens are streamed directly to the text-to-speech system, allowing users to begin hearing the response before the entire text is generated.
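
The streaming idea can be illustrated with a minimal Python sketch. This is not TalkBack's or Gemini Nano's actual code: `fake_model_tokens`, `chunk_tokens` and `stream_to_speech` are hypothetical names, the token list is an invented sample, and the `speak` callback stands in for a real text-to-speech engine. It only shows the general pattern of handing short phrases to speech output as soon as they are complete, instead of waiting for the full description.

```python
from typing import Callable, Iterable, Iterator, List


def fake_model_tokens() -> Iterator[str]:
    """Hypothetical stand-in for a model that emits a description token by token."""
    sample = ["A", " panoramic", " view", " of", " Sydney", " Opera", " House",
              ",", " seen", " at", " night", "."]
    for token in sample:
        yield token


def chunk_tokens(tokens: Iterable[str], boundaries: str = ".,;:!?") -> Iterator[str]:
    """Group streamed tokens into short phrases that end at punctuation."""
    buffer: List[str] = []
    for token in tokens:
        buffer.append(token)
        # Flush as soon as a phrase boundary arrives, so speech can start early.
        if token.rstrip().endswith(tuple(boundaries)):
            yield "".join(buffer).strip()
            buffer = []
    if buffer:
        # Flush whatever is left when the model stops generating.
        yield "".join(buffer).strip()


def stream_to_speech(tokens: Iterable[str], speak: Callable[[str], None]) -> List[str]:
    """Pass each phrase to `speak` while later tokens are still being generated."""
    spoken: List[str] = []
    for phrase in chunk_tokens(tokens):
        speak(phrase)
        spoken.append(phrase)
    return spoken


if __name__ == "__main__":
    # `print` plays the role of the text-to-speech engine in this sketch.
    stream_to_speech(fake_model_tokens(), print)
```

Because both functions consume the tokens lazily, the first phrase reaches `speak` before the model has produced the rest of the description, which is the effect the team relies on to mask the extra latency of the 768-pixel option.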

While I'm not yet fully boarding the AI hype train, AI-powered features like this are stunning – just think about the potential! And then, there are stories like this one that make you want to tone down this "wonderful" progress of ours:

Sebastian Pier Junior Tech News Writer
Sebastian, a veteran tech writer with over 15 years of experience in media and marketing, blends his lifelong fascination with writing and technology to provide valuable insights into the realm of mobile devices. Embracing the evolution from PCs to smartphones, he harbors a special appreciation for the Google Pixel line due to its superior camera capabilities. Known for his engaging storytelling style, sprinkled with rich literary and film references, Sebastian critically explores the impact of technology on society, while also perpetually seeking out the next great tech deal, making him a distinct and relatable voice in the tech world.
