The secret of Google's amazing voice recognition revealed: it works like a brain

Posted: , by Victor H.

Voice recognition technology like Siri and Google’s Voice Search in Android has really come a long way. It first reached mainstream users with the iPhone 4S and Android devices of about that time, and now leading minds in tech like Apple co-founder Steve Wozniak and Microsoft’s Bill Gates agree that it has immense potential to change the way we interact with our devices.

And while we were impressed with Siri when it first launched on the iPhone 4S in 2011, it was the swift and almost flawless recognition in Google’s Voice Search that set the bar this high for voice.

But how does it work and what makes Google’s Voice Search so good?

We've heard it before and now we get one more confirmation that the inspiration for it comes from the neural networks in our brain. The implementation of the ‘neural network’ started in Jelly Bean and brought a whopping 25% drop in voice recognition errors.

‘"It really is changing the way that people behave." ...When you talk to Android's voice recognition software, the spectrogram of what you've said is chopped up and sent to eight different computers housed in Google's vast worldwide army of servers. It's then processed, using the neural network models built by Vanhoucke and his team.’


Basically, with such ample cloud processing power, Google can analyze a huge number of patterns, which in the case of voice are spectrograms, and use them to predict new patterns, much like the neurons in the brain would reconnect to accomplish new tasks.
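For readers wondering what a spectrogram actually is, here is a minimal sketch in plain Python. It is not Google's pipeline: the function names, the frame size and the hop size are all assumptions chosen for illustration. It slices a signal into overlapping frames and measures how much energy each frequency carries in each frame.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of frequency-bin magnitudes per overlapping frame.

    frame_size and hop are illustrative values, not anything from the article.
    """
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window: fade the frame edges to reduce spectral leakage.
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Naive discrete Fourier transform; keep non-negative frequencies only.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


def dominant_bin(frame):
    """Index of the frequency bin with the largest magnitude."""
    return max(range(len(frame)), key=lambda k: frame[k])


# Synthetic "voice": a pure tone with 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = spectrogram(tone)
print(len(spec), "frames,", len(spec[0]), "bins per frame")
print("dominant bin in first frame:", dominant_bin(spec[0]))
```

Real speech produces a far messier picture than a single tone, and those pictures are the patterns a recognizer has to learn from.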

There are a couple of layers in processing speech. First, Google tries to understand the consonants and the vowels. That is the foundational layer. Next, it uses those to make intelligent guesses about the words, and it keeps building upward from there.

The same approach is applied to image analysis: you first detect edges in an image, then check for edges close to each other to find a corner, and go higher from there.
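The edges-then-corners idea can be shown with a toy example. This is only a hand-written sketch: a real neural network learns its detectors from data, and the helper names and the tiny test image here are made up for illustration. The first layer looks for brightness changes (edges), and the second layer reports a corner wherever a vertical edge and a horizontal edge meet.

```python
def gradients(image):
    """Layer 1: central-difference edge responses (border pixels stay 0)."""
    h, w = len(image), len(image[0])
    gx = [[0] * w for _ in range(h)]  # responds to vertical edges
    gy = [[0] * w for _ in range(h)]  # responds to horizontal edges
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx[y][x] = image[y][x + 1] - image[y][x - 1]
            gy[y][x] = image[y + 1][x] - image[y - 1][x]
    return gx, gy


def find_corners(image):
    """Layer 2: a corner is a pixel where both edge types respond."""
    gx, gy = gradients(image)
    return [
        (y, x)
        for y in range(len(image))
        for x in range(len(image[0]))
        if gx[y][x] != 0 and gy[y][x] != 0
    ]


# 7x7 dark image with a bright block filling the bottom-right quadrant,
# so the only visible corner of the block sits at row 3, column 3.
image = [[1 if (y >= 3 and x >= 3) else 0 for x in range(7)] for y in range(7)]
corners = find_corners(image)
print(corners)
```

Higher layers would combine corners into shapes in the same way, just as the speech layers combine sounds into words.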

It’s a fascinating, revealing piece on the bits and pieces the future will be built on, and if you’re interested you can read the original article at Wired, linked below, for the details.

source: Wired

16 Comments




posted on 19 Feb 2013, 03:32 5

1. cezarepc (Posts: 609; Member since: 23 Nov 2012)


It's only a matter of time..... Cyberdyne!

posted on 19 Feb 2013, 03:34 1

3. Sniggly (Posts: 7182; Member since: 05 Dec 2009)


The difference being that Google isn't interwoven with the military, and isn't likely to be (other than Android phones being used).

posted on 19 Feb 2013, 07:03 1

11. kostas12ldb (Posts: 51; Member since: 18 Feb 2013)


And how exactly could you be so sure about that?
I actually BET that all data from your phone (not only Android, but iOS, W8, etc) are gathered somewhere to make a nice huge file for you (the user).

posted on 19 Feb 2013, 03:33 8

2. Sniggly (Posts: 7182; Member since: 05 Dec 2009)


That is pretty fricken' sweet. And people say that Google isn't pushing the envelope of technology.

posted on 19 Feb 2013, 03:45 7

4. bayhuy (Posts: 307; Member since: 23 Jun 2011)


Skynet is here! :-))

posted on 19 Feb 2013, 12:53

15. pkiran1996 (Posts: 165; Member since: 22 Oct 2011)


We've gort to geyt out of heyre.

posted on 19 Feb 2013, 04:18 1

5. SonyXperiaNexus (Posts: 374; Member since: 01 Oct 2012)


before long, google's voice search will start to develop a personality by itself...unlike siri which is actually programmed by humans to do so.

posted on 19 Feb 2013, 04:28

6. UrbanPhantom (Posts: 949; Member since: 30 Oct 2012)


Gimmick. *Yawn*

posted on 19 Feb 2013, 04:43 5

8. Nathan_ingx (Posts: 3024; Member since: 07 Mar 2012)


Five years back and i might have believed in you...but nada, this is the real thing. It ain't a gimmick anymore...
Where you been? How long have you been hibernating?

posted on 19 Feb 2013, 06:22 1

10. jroc74 (Posts: 5192; Member since: 30 Dec 2010)


"And while we were impressed with Siri when it first launched on the iPhone 4S in 2011, it was Google’s Voice Search swift and almost flawless voice recognition technology that has set the bar this high for voice."

Siri - when it doesn't or didn't recognize something....it shows or recommends a picture of stallions.

Google Voice recognition - it rarely fails to recognize you.

So...are you saying all voice recognition is a gimmick...or just Google's version....also...Google Now...was set up to just give you information based on searches and other things you can set up. Siri had an update to do similar after the fact.

I'm just curious what you think is a gimmick...

posted on 19 Feb 2013, 04:40 4

7. tedkord (Posts: 5264; Member since: 17 Jun 2009)


But Siri can give you a witty, preprogrammed canned response.

posted on 19 Feb 2013, 05:19 5

9. thelegend6657 (unregistered)


Honestly Google voice just destroys s**tVoice by Samsuck
I speak with a heavy Asian accent, and surprisingly Google Voice manages to get things right 95% of the time!
I do like the personality of Siri though, but it's slower and Google Voice does a much better job at interpreting your words.
SVoice though , its just terrible

posted on 19 Feb 2013, 09:32 1

14. ebubekir26 (Posts: 312; Member since: 21 Dec 2012)


i agree that S-Voice sucks
but I don't agree With you saying "Samsuck"
that's disrespect to one of the best smartphone makers
I never use S-voice on My Note II
Google voice search is better, smoother, and most importantly FASTER
Samsung will make everyone's jaw drop announcing the S4 and Note III

posted on 19 Feb 2013, 07:23 1

12. lyndon420 (Posts: 1785; Member since: 11 Jul 2012)


My voice assistant is a cute little brunette...and I named her b*tch.

posted on 19 Feb 2013, 09:03

13. appleDOESNT.com (banned) (Posts: 456; Member since: 19 Nov 2011)


"It all started materializing to users with the iPhone 4S and Android devices of about that time"

Nah, PA as usual skewing the facts towards Apple. Not even close, Google had solid voice commands in 2009 and the fantastic Google Now, well now... Apple had no such service until late '11

posted on 28 Sep 2013, 00:41

16. Elliander (Posts: 1; Member since: 28 Sep 2013)


If you think this technology is "amazing" or "acts like a brain" you haven't really tried it.

I say, "Google Ants" it brings up "Google Finance". OK, similar sounds. I say "Google Ant" it says "Google aunt." OK, again, similar sound. I said, "Google Ant A-N-T" (spelling it out) and it gives me some random stock price.

Tried something else. "Google Insect" well that works, but probably only because there is nothing similar to it. I try "Insect Ant" and it says "Insect Aunt", so it's clearly incapable of using context clues to determine what I said.

From here I tried clicking the words. Sure enough there was a drop down to select other options, with Ant being one of them. Unfortunately, clicking it doesn't do anything. Once I click anything else or click outside it the word is replaced with aunt again. If only it could at least allow me to replace the word and show new results it would be useful, but alas, that doesn't work.

I say, "Baa Baa Black Sheep" and it thinks "Bob, Bob, Black Sheep." (Note, contrary to what the subtext says in this post, the word sheep is NOT offensive. Nothing wrong with fluffy animals or nursery rhymes. If I get banned for saying it, fine.)

I say, "Google Amaze" and it thinks I said "Google Mail".

I say, "Google Maze" and it hears that, but what if I wanted to know about the crop? Sadly, similar sounding words just wouldn't work. Shouldn't it be able to take information from the camera to put it into context though?

I tried just saying "ant" again and it says "aunt" so I say "No! A-N-T" and it searches for "No estan team" (WTF!?). See, one main problem with the system is that it is incapable of learning my speech patterns or recognizing and learning from mistakes. If it REALLY worked like a brain it would have a built-in system for recognizing errors. If I say the same word repeatedly with a tenser and tenser tone, chances are it didn't understand me, but it just can't learn to understand me.

Oh, now this is funny: If I use profanities it will search for just the first letter. Like F***. I said that one more clearly, and it searched for Bach. Anyway, they clearly have a word filter system, but even similar sounding words tend to get filtered half the time. This is a slippery slope, because it is up to google to decide what is offensive.

My conclusion: Google voice search will only work half the time and only for some of the most common phrases. It will never learn to adjust itself to you, and will never understand similar sounding words that are less common. It won't decipher what you are probably saying based on the context or usage. It has possible uses, but those uses are very limited. Its only real strength is that when it does understand you it is capable of quickly delivering results and can even handle a long sentence. Also, it doesn't appear to be affected by music in the background.
