The secret of Google's amazing voice recognition revealed: it works like a brain

Voice recognition technology like Siri and Google’s Voice Search in Android has really come a long way. It all started materializing to users with the iPhone 4S and Android devices of about that time, and now leading minds in tech like Apple co-founder Steve Wozniak and Microsoft’s Bill Gates agree that it has immense potential to change the way we interact with our devices.

And while we were impressed with Siri when it first launched on the iPhone 4S in 2011, it was the swift and almost flawless recognition in Google’s Voice Search that set the bar this high for voice.

But how does it work and what makes Google’s Voice Search so good?

We've heard it before and now we get one more confirmation that the inspiration for it comes from the neural networks in our brain. The implementation of the ‘neural network’ started in Jelly Bean and brought a whopping 25% drop in voice recognition errors.



Basically, using ample cloud processing power, Google can analyze a ton of patterns, which in the case of voice are spectrograms, and use them to predict new patterns, much like the neurons in the brain would reconnect to accomplish new tasks.
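
To give a rough idea of what a spectrogram is, here is a minimal Python sketch. It is not Google's code: the function names, the frame size, the hop and the toy 1000 Hz tone are all assumptions made for illustration. It slices a signal into short overlapping frames and measures how much of each frequency is present in every frame.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of a naive discrete Fourier transform, bins 0..N/2."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(total))
    return mags

def spectrogram(samples, frame_size=64, hop=32):
    """One row of frequency magnitudes per overlapping frame of the signal."""
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window softens the frame edges before the transform.
        windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_size - 1)))
                    for i, x in enumerate(frame)]
        rows.append(dft_magnitudes(windowed))
    return rows

# Toy input (an assumption, not real speech): a pure 1000 Hz tone at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(256)]

spec = spectrogram(tone)
# Strongest frequency bin in each frame; bin k covers k * SAMPLE_RATE / 64 Hz.
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
```

Real speech produces a much busier picture than a single tone, and it is those time-versus-frequency patterns that a recognizer would be trained on.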

There are a couple of layers in processing speech. First, Google tries to understand the consonants and the vowels. That is the foundational layer. Next, it uses those to make intelligent guesses about the words. And then it goes higher from there.

The same approach is applied to image analysis, where you first try to detect edges in an image, then check for edges close to each other to find a corner, and then go higher from there.
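
The edges-then-corners idea can be sketched in a few lines of Python. This is a hand-written toy, not a trained neural network and not anything from Google: the `edges` and `corners` functions, the 8x8 test image and the `radius` parameter are all hypothetical. The point is only that the second layer works on the output of the first layer rather than on raw pixels.

```python
def edges(image):
    """Layer 1: mark where brightness changes between neighboring pixels."""
    h, w = len(image), len(image[0])
    horiz = [[0] * w for _ in range(h)]  # change along x (a vertical edge)
    vert = [[0] * w for _ in range(h)]   # change along y (a horizontal edge)
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                horiz[y][x] = abs(image[y][x + 1] - image[y][x])
            if y + 1 < h:
                vert[y][x] = abs(image[y + 1][x] - image[y][x])
    return horiz, vert

def corners(horiz, vert, radius=1):
    """Layer 2: a corner candidate has both kinds of edge close by."""
    h, w = len(horiz), len(horiz[0])
    found = []
    for y in range(h):
        for x in range(w):
            near = [(yy, xx)
                    for yy in range(max(0, y - radius), min(h, y + radius + 1))
                    for xx in range(max(0, x - radius), min(w, x + radius + 1))]
            has_h = any(horiz[yy][xx] for yy, xx in near)
            has_v = any(vert[yy][xx] for yy, xx in near)
            if has_h and has_v:
                found.append((y, x))
    return found

# Toy image (an assumption): a bright 4x4 square on a dark 8x8 background.
image = [[1 if 2 <= y <= 5 and 2 <= x <= 5 else 0 for x in range(8)]
         for y in range(8)]

horiz, vert = edges(image)
found = corners(horiz, vert)
```

Pixels around the four corners of the square end up in `found`, while pixels along the middle of a side or inside the square do not, because only one kind of edge (or none) is nearby.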

It’s a fascinating, revealing piece on the bits and pieces the future will be built on, and if you’re interested you can hit the original article at Wired below for the details.

source: Wired

16 Comments

1. cezarepc

Posts: 718; Member since: Nov 23, 2012

It's only a matter of time..... Cyberdyne!

3. Sniggly

Posts: 7305; Member since: Dec 05, 2009

The difference being that Google isn't interwoven with the military, and isn't likely to be (other than Android phones being used).

11. kostas12ldb

Posts: 51; Member since: Feb 18, 2013

And how exactly could you be so sure about that? I actually BET that all data from your phone (not only Android, but iOS, W8, etc) are gathered somewhere to make a nice huge file for you (the user).

2. Sniggly

Posts: 7305; Member since: Dec 05, 2009

That is pretty fricken' sweet. And people say that Google isn't pushing the envelope of technology.

4. bayhuy

Posts: 320; Member since: Jun 23, 2011

Skynet is here! :-))

15. pkiran1996

Posts: 166; Member since: Oct 22, 2011

We've gort to geyt out of heyre.

5. SonyXperiaNexus

Posts: 374; Member since: Oct 01, 2012

before long, google's voice search will start to develop a personality by itself...unlike siri which is actually programmed by humans to do so.

6. UrbanPhantom

Posts: 949; Member since: Oct 30, 2012

Gimmick. *Yawn*

8. Nathan_ingx

Posts: 4769; Member since: Mar 07, 2012

Five years back and i might have believed in you...but nada, this is the real thing. It ain't a gimmick anymore... Where you been? How long have you been hibernating?

10. jroc74

Posts: 6023; Member since: Dec 30, 2010

"And while we were impressed with Siri when it first launched on the iPhone 4S in 2011, it was Google’s Voice Search swift and almost flawless voice recognition technology that has set the bar this high for voice." Siri - when it doesn't or didn't recognize something... it shows or recommends a picture of stallions. Google Voice recognition - it rarely doesn't recognize you. So... are you saying all voice recognition is a gimmick... or just Google's version... also... Google Now... was set up to just give you information based on searches and other things you can set up. Siri had an update to do similar after the fact. I'm just curious what you think is a gimmick...

7. tedkord

Posts: 17303; Member since: Jun 17, 2009

But Siri can give you a witty, preprogrammed canned response.

9. thelegend6657 unregistered

Honestly Google voice just destroys s**tVoice by Samsuck. I speak with a heavy Asian accent, and surprisingly Google Voice manages to get things right 95% of the time! I do like the personality of Siri though, but it's slower and Google voice does a much better job at interpreting your words. SVoice though, it's just terrible.

14. ebubekir26

Posts: 337; Member since: Dec 21, 2012

I agree that S-Voice sucks, but I don't agree with you saying "Samsuck"; that's disrespect to one of the best smartphone makers. I never use S-Voice on my Note II. Google voice search is better, smoother, and most importantly FASTER. Samsung will make everyone's jaw drop announcing the S4 and Note III.

13. appleDOESNT.com

Posts: 456; Member since: Nov 19, 2011

"It all started materializing to users with the iPhone 4S and Android devices of about that time" Nah, PA as usual skewing the facts towards Apple. Not even close, Google had solid voice commands in 2009 and the fantastic Google Now, well now... Apple had no such service until late '11

16. Elliander

Posts: 1; Member since: Sep 28, 2013

If you think this technology is "amazing" or "acts like a brain" you haven't really tried it. I say, "Google Ants" and it brings up "Google Finance". OK, similar sounds. I say "Google Ant" and it says "Google aunt." OK, again, similar sound. I said, "Google Ant A-N-T" (spelling it out) and it gives me some random stock price. Tried something else. "Google Insect", well, that works, but probably only because there is nothing similar to it. I try "Insect Ant" and it says "Insect Aunt", so it's clearly incapable of using context clues to determine what I said. From here I tried clicking the words. Sure enough there was a drop-down to select other options, with Ant being one of them. Unfortunately, clicking it doesn't do anything. Once I click anything else or click outside it the word is replaced with aunt again. If only it could at least allow me to replace the word and show new results it would be useful, but alas, that doesn't work. I say, "Baa Baa Black Sheep" and it thinks "Bob, Bob, Black Sheep." (Note, contrary to what the subtext says in this post, the word sheep is NOT offensive. Nothing wrong with fluffy animals or nursery rhymes. If I get banned for saying it, fine.) I say, "Google Amaze" and it thinks I said "Google Mail". I say, "Google Maze" and it hears that, but what if I wanted to know about the crop? Sadly, similar sounding words just wouldn't work. Shouldn't it be able to take information from the camera to put it into context though? I tried just saying "ant" again and it says "aunt" so I say "No! A-N-T" and it searches for "No estan team" (WTF!?). See, one main problem with the system is that it is incapable of learning my speech patterns or recognizing and learning from mistakes. If it REALLY worked like a brain it would have a built-in system for recognizing errors.
If I say the same word repeatedly with a tenser and tenser tone, chances are it didn't understand me, but it just can't learn to understand me. Oh, now this is funny: If I use profanities it will search for just the first letter. Like F***. I said that one more clearly, and it searched for Bach. Anyway, they clearly have a word filter system, but even similar sounding words tend to get filtered half the time. This is a slippery slope, because it is up to Google to decide what is offensive. My conclusion: Google voice search will only work half the time and only for some of the most common phrases. It will never learn to adjust itself to you, and will never understand similar sounding words that are less common. It won't decipher what you are probably saying based on the context or usage. It has possible uses, but those uses are very limited. Its only real strength is that when it does understand you it is capable of quickly delivering results and can even handle a long sentence. Also, it doesn't appear to be affected by music in the background.

17. ansibytecode

Posts: 1; Member since: Aug 02, 2018

Definitely, voice search is the next generation of search engine and Amazon Alexa is playing a big role.
