Artificial intelligence showdown: Google Lens vs Bixby Vision vs Huawei HiVision

Recently, smartphone manufacturers have been slapping the AI tag on pretty much every feature they have. From cameras to RAM management, supposedly everything uses artificial intelligence and machine learning for better performance and personalized results. You can’t really interact with most of these implementations, but there’s one you can see working before your very eyes.

We’re talking about real-world object recognition – the ability of a phone to recognize what its camera is being pointed at. The way this works is not as futuristic as you might think, although the results can still be impressive. Put simply, services using the technology compare the image from your phone against a database of tagged images – or, in other words, images that have been identified and labeled (usually by humans). The best matches are shown to you as results. Because these databases are too big to fit on your phone, the services need an internet connection to work.
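That matching step can be pictured as a nearest-neighbor lookup over labeled feature vectors. Below is a minimal, purely illustrative sketch: the three-number "features" and the tiny label table are invented for the example, while real services use learned image embeddings and databases with billions of tagged photos.

```python
import math

# Toy "database" of labeled feature vectors. In a real service each
# entry would be a high-dimensional embedding of a tagged photo.
DATABASE = {
    "plum":      (0.9, 0.1, 0.4),
    "croissant": (0.2, 0.8, 0.5),
    "snickers":  (0.3, 0.7, 0.9),
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def best_matches(query, k=2):
    """Rank the database labels by similarity to the query vector."""
    ranked = sorted(DATABASE.items(),
                    key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [label for label, _ in ranked[:k]]

# A plum-like query vector lands closest to the "plum" entry.
print(best_matches((0.85, 0.15, 0.35)))
```

The quality of the results then comes down to two things: how good the embeddings are, and how many well-tagged images sit in the database – which is exactly where the three companies' different data sources come into play.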

The main players in this field are Google with Google Lens, Samsung with Bixby Vision, and Huawei with HiVision. The three rely on different databases: Google has developed its own, Samsung relies heavily on Pinterest for object recognition, and Huawei has partnered with Microsoft for its identification. There are some variations in what each app can do, which is why we’re focusing on the two main features: recognizing different types of objects and translating text.

Challenge 1: Food recognition


Task 1: Sneaky round fruit


Anyone can recognize a banana, but what about a fruit with a more common shape, like a plum for example? Here’s what we got:


Google Lens is short and to the point. Bixby Vision, on the other hand, was all over the place with this one. While the screenshot shows it suggesting the item is a gemstone, other suggestions we didn’t capture included a cricket ball and a radish. No matter how many times we tried, it never concluded that it was looking at a plum. When it comes to food, on a Huawei device it’s better to use the dedicated food mode, which is powered by a different company and provides more accurate results. In this case, the plum was no match for HiVision.

Task 2: A French breakfast


This one should be relatively easy since the subject (a croissant) has a distinct shape. Let’s see the results:


A respectable showing from all three contenders, but Bixby Vision seemed a bit unsure. The words within the circle were rapidly changing from one type of pastry to another, which is why it was hard to capture the moment it said “croissant” (you can spot it if you enlarge the middle screenshot). The croissant in question didn’t have a chocolate filling, but since smartphones don’t have X-ray vision (yet?), we’ll turn a blind eye to Huawei’s mistake.

Task 3: Calorie bomb


This is a more challenging task since candy bars often have similar shapes and textures, which is why we decided to split our test subject apart and give the phones a bit more to work with. And they did not disappoint:


This round it was Google Lens’ turn to slip. It seems Google hasn’t spent much time tagging pictures of desserts, so the software assumed the Snickers bar was some other type of multilayered candy. Meanwhile, Bixby Vision and HiVision were quick to point out that we’re looking at a Snickers.

Challenge 2: Animals… kind of 


Task 1: Guess the bird


Granted, what’s shown below is not a real bird but it appears to be close enough to fool the matching algorithms of our AI helpers. Here’s what they showed when faced with a stork garden decoration:


As usual, Bixby is a bit vaguer, but if you look through the image results it suggests, you’ll find quickly enough that the bird is, in fact, a stork.

Task 2: What breed is that dog?


Here things get challenging, as dog breeds often look alike, which makes it particularly difficult for the AI to give a correct match. That’s especially true with the dog that served as our model, which belongs to a breed that isn’t particularly popular globally. But first, let’s see what the phones’ best guesses were.


Here, the results really surprised us. We didn’t expect any phone to get it right, but Google Lens nails it once again. Karakachan, also known as the Bulgarian Shepherd, is the actual breed of the dog. It’s even more impressive considering the coloring is not typical for the breed, which mostly consists of dogs with black and white fur. The other two did fairly well considering the challenging task; their results were acceptable even if technically incorrect.

Task 3: What animal is that plushie?


Time for something more abstract. We saw this goofy-looking sheep plushie and decided to test whether the software would be able to recognize what animal it is despite the weird proportions. The results were hit and miss.


In its typical style, Google’s result looks like the software is bored with your constant questions and just spits out “sheep”, which is indeed correct. However, we can’t blame the other two apps for suggesting “toy”, since the plushie is, obviously, more a toy than a real sheep. Still, Bixby Vision had a hard time realizing there was only one object it needed to recognize and suggested similar images of pies and other whipped-cream-decorated pastries. At least it’s amusing, if not very helpful.

Challenge 3: Products 


A big part of the marketing of these apps is focused on the way they can recognize different products so you can buy them or just get more information about them while you’re on the go. So, we decided to test them with products of various popularity.

Task 1: Mysterious white object


While for most people the AirPods case is easily recognizable, its shape can be tricky for algorithms to identify correctly. Or can it?


Almost perfect results! We say almost because HiVision’s first suggestions were AirPods lookalikes/knockoffs, which is not ideal. Bixby Vision thought for a second that the case was a bar of soap but quickly got to the right product. It seems the abundance of pictures of the AirPods charging case helps with recognition quite a bit.

Task 2: Tiny dark lord


This task is both easy and hard. On one hand, the helmet of Darth Vader is one of the most recognizable objects in pop culture. On the other hand, there are thousands of products that use it. So, how accurate can the apps be?


Well, what do you know? Three out of three! The exact keychain with light-up LED eyes was in the top results of each app. Quite impressive. Time for the final round!

Task 3: Cool shades


Now, most sunglasses have a similar shape, which would make the task too difficult, so we chose a pair with a more distinct look from a popular brand.


Google Lens and HiVision share the top spot in this one, both suggesting the exact Dolce & Gabbana sunglasses that were in front of them. Samsung’s suggested pair was close enough but still not the one in question.

Challenge 4: Text translation


Real-time text translation is probably the most useful feature these apps provide. Being able to quickly check what a piece of text in a foreign language means can make your trips abroad a lot easier. Time to see how our three AI contenders will perform the task. 
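Under the hood, these apps chain two steps: optical character recognition on the camera frame, then machine translation of the recognized string, with the result drawn over the original text. The toy sketch below stands in for that pipeline; the `fake_ocr` function and the tiny phrase table are invented for illustration, whereas the real apps use on-device text-detection models and cloud translation services.

```python
# Toy stand-in for the camera-translation pipeline:
#   1) OCR turns pixels into a string,
#   2) that string is translated,
#   3) the app overlays the result on screen.
# Both steps are mocked here with a lookup table.

PHRASE_TABLE = {
    "hunde an der leine führen!": "dogs must be kept on a leash!",
    "駐車禁止": "no parking",
}

def fake_ocr(image):
    # A real OCR step would find and read text regions in the image.
    # Here the "image" is already the text it contains.
    return image

def translate(text):
    # Look up the whole phrase; fall back to the original string,
    # roughly what the apps do when recognition fails.
    return PHRASE_TABLE.get(text.strip().lower(), text)

def lens_translate(image):
    return translate(fake_ocr(image))

print(lens_translate("Hunde an der Leine führen!"))
```

The overlay quality we comment on below is a separate, purely visual step: once the translated string exists, each app differs in how neatly it paints it over the original sign.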

Task 1: A warning sign in German


You’re walking around a park in Germany when you see the sign that clearly says something important. You don’t know any German and you don’t want to get in trouble, so you pull out your phone and let the powers of AI translate it. Here’s what you get:


All three will give you enough information about what the sign is warning you not to do, but the translation on the Huawei slightly edges out the other two for including the word “lead”. The original sign says: “All dogs must be walked on a leash! Excluding guide dogs.”

Task 2: A warning sign in Japanese


Similar scenario, but this time you’re in Japan. Just checking if you have anything to be concerned about:

Again, pretty clear: if you have a car, that’s not the place to park it. You never know when those firefighting activities will break out! The overlays are far from ideal but they get the point across, which is what matters in this case.

Task 3: Text in French


Time to take things to another level. Say you have a piece of text in an unknown language and want to know what it’s about. Well, time for your smartphone to prove how smart it is. You scan the text and here are the results:


We don’t know what’s going on with Bixby Vision here, but if we were Google, we’d want to have a word with Samsung about putting “translated by Google” under that abomination. Both Google Lens and HiVision translate the text well enough that you can understand what the story is about and soak up its wisdom. Google Lens gets extra credit for overlaying the translation better; the Huawei one looks a bit like a ransom note.

Here’s the actual text of the popular fable about the crow and the fox:

 Mr. Crow, sitting in a tree,
Held a piece of cheese in his beak.
Mr. Fox, mouth watering from the scent,
Uttered almost precisely this to him:
“Hey! Good morning, Mr. Crow.
How lovely you are! You look so beautiful!
Without lying, if your songs
Are in keeping with your feathers,
You are the Phoenix of the inhabitants of these woods.”
With these words the Crow feels nothing but delight.
And to show off his beautiful voice,
He opens a wide beak and lets his prey fall.
The Fox grabs it and says: “My dear sir,
Learn that every flatterer
Lives at the expense of the one who listens to him.
This lesson is worth a piece of cheese, no doubt.”
The Crow, ashamed and embarrassed,
Swore, but a bit late, that he would never be fooled again.

Final thoughts 


Time to talk about how it feels using each of the apps. Google Lens is the most intuitive one: once it recognizes an object, a dot shows up, you tap it and get more information. Sometimes, however, it would just continue scanning without picking up the object that’s right in front of you. But scanning the object from a different angle may help. Overall, it’s currently the most polished and useful app of the three we tested.

Samsung’s Bixby Vision is hit or miss – but mostly miss. It took the longest to produce accurate results, which we know only because we already knew what the correct answer was. If you’re actually relying on Bixby Vision to identify something for you, luck will be a big factor. Suggestions sometimes change multiple times a second and vary wildly between all sorts of objects. It would be better if the app just chose one answer and stuck with it, even if it were the wrong one, instead of throwing random words at you hoping to get it right eventually.

Huawei’s HiVision did quite well on our tests and can definitely be useful in certain situations. Sometimes, though, it can give a bit too much information. If you have an object on a table you don’t need the app to tell you that there’s a table in the picture as well, or that there’s a hardwood floor in the background. Still, that’s a minor annoyance. What the developers need to work on is a more pleasing design. Those transparent text boxes look very dated and give a sort of gimmicky vibe to the whole app, which is unfortunate.

The good thing about this type of software is that the longer it exists and the more people use it, the better it gets. And if we’re already seeing some pretty good results now, imagine what will be possible in a few years. We don’t want to venture too far into creepy territory, but it’s not impossible that someday you’ll be able to point your phone at a person and get their name and email as a result. Still, it’s exciting to see where the technology will take us, and a similar test in a year or two could say a lot about how fast things are moving. Stay tuned!

10 Comments

1. pokharkarsaga

Posts: 554; Member since: Feb 23, 2012

" The Huawei one looks a bit like a ransom note ". ROFL!!!!!!!

2. cmdacos

Posts: 4264; Member since: Nov 01, 2016

If you rely on any AI, use Google. You'll be far less disappointed.

3. itz_charlie01

Posts: 8; Member since: Aug 15, 2017

I've got to say this whole Ai thing has come a long way from being a gimmick to an app that's actually useful. I remember trying it a year ago heck, I didn't get this much info outta it.

4. Pigaro

Posts: 87; Member since: May 15, 2016

A dead flying insect I didn't know the name, so I tried lens on it & it gave me its name & I was so happy because my friends & I have been looking for its name a month now. Thanks Google.

5. CDexterWard

Posts: 85; Member since: Feb 05, 2018

Great article. Wanted to see something like this for a while. Love that we can now have an ocr / translator in our pockets. Remember when people would be fumbling around with foreign language dictionaries trying to figure out one word at a time?

6. apple-rulz

Posts: 2195; Member since: Dec 27, 2016

Samsung needs to stop pumping resources into Bixby, it’s beating a dead horse. Bixby would only make sense if Samsung was developing their own OS (check what OS the Note 10’s have before countering with Tizen), but they’re not.

9. cmdacos

Posts: 4264; Member since: Nov 01, 2016

How did siri do in this test?

8. oldskool50 unregistered

I would say they all did as well as expected. The first few items are difficult for several reasons. The main one is that so many products and things can look very similar. The plum: a plum can look like a lot of things. You are asking a phone to see it as a plum and not a similar-looking gemstone or some other fruit with a similar look. The dog: you know how many dogs look similar to that? Even I got that bread wrong, because it looks like a fancy croissant. Since I don't live in the country of the bread in the photo, I would not be familiar with it. You chose a few things that were common and things that were also specific to a location. If you asked any human what breed of dog that was, unless they're familiar with that breed, they're gonna get it wrong. So I expect the phone will too. If you asked a human what type of bread that was, I think most would just say it's a fancy croissant.

I would say this was a good comparison. I have used Bixby Vision a few times and I can honestly say it sucks. Pinterest is a terrible go-to database for anything in general. But I have used it on more common items and it has gotten them right and offers pricing and possible locations for purchase. I think the results really come down to how well the AI is programmed for your region.

Text translation has been hit or miss with Bixby. I always use Google for that. Even Samsung's own S-Translator is bad. For example, I have these videos on YT I love to watch. The people in the videos are from Asian countries: one is Japanese, one Chinese, one Korean, and then they invite others. All of them can speak at least one of the languages equally well. So I usually copy the Korean text and try to get my Samsung to translate it, thinking that since the phone is made by a Korean company, it shouldn't get this wrong. Yet it has. I guess we see who needs to work on their AI.
Google is simply the best at this because they have been doing it longer, and because they create things from scratch instead of using someone else's convoluted offering as the background for the info, which is where Samsung and Huawei need to work harder.

10. cmdacos

Posts: 4264; Member since: Nov 01, 2016

That looks as much like a croissant as a muffin does. Croissants are crescent shaped.

11. Cyberchum

Posts: 1093; Member since: Oct 24, 2012

Interesting and informative article. Google lens it is (not surprised). Huawei (Microsoft) is a close second. Samsung have to step up. In the end, all three need improvements. And I think they'll come.
