The secret of Google's amazing voice recognition revealed: it works like a brain
posted by Victor H. / Feb 19, 2013, 3:27 AM
And while we were impressed with Siri when it first launched on the iPhone 4S in 2011, it was Google’s Voice Search, with its swift and almost flawless voice recognition technology, that has set the bar this high for voice.
But how does it work and what makes Google’s Voice Search so good?
We've heard it before, and now we get one more confirmation that the inspiration for it comes from the neural networks in our brains. The implementation of the ‘neural network’ started in Android Jelly Bean and brought a whopping 25% drop in voice recognition errors.
Basically, using its ample cloud processing power, Google can analyze a ton of patterns - which in the case of voice are spectrograms - and use them to predict new patterns, much like the neurons in the brain would reconnect to accomplish new tasks.
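For readers curious what a spectrogram actually is, here is a minimal Python sketch. It is only an illustration and not Google's code: the function name `spectrogram`, the frame size, the hop length and the test tone are all assumptions made for this example. It slices a signal into overlapping frames and measures how strong each frequency is in every frame.

```python
import cmath
import math

def spectrogram(samples, frame_size, hop):
    """Return one list of frequency-bin magnitudes per frame of audio."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        magnitudes = []
        # Naive discrete Fourier transform; for real-valued input only
        # the first half of the bins carries unique information.
        for k in range(frame_size // 2 + 1):
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames

# Hypothetical input: a pure tone completing 4 cycles per 32-sample frame.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
spec = spectrogram(tone, frame_size=32, hop=16)

# The strongest bin in each frame; for this tone it is bin 4 every time.
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
```

Real speech produces a far messier picture than a single tone, and it is patterns in that kind of time-versus-frequency grid that a recognizer learns from.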
There are a couple of layers in processing speech. First, Google tries to understand the consonants and the vowels. That is the foundational layer. Next, it uses those to make intelligent guesses about the words. From there, it moves on to higher levels.
The same approach is actually applied to image analysis, where you first try to detect edges in an image, then check for edges close to each other to find a corner, and then go higher from there.
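The edge-then-corner idea can be sketched in a few lines of Python. This is a toy, hand-written illustration under our own assumptions (the function names `detect_edges` and `find_corners` and the tiny 4x4 image are made up for the example); a neural network learns such layers from data rather than having them coded by hand.

```python
def detect_edges(image):
    """Layer 1: return (horizontal, vertical) sets of edge pixel coordinates."""
    rows, cols = len(image), len(image[0])
    horizontal, vertical = set(), set()
    for r in range(rows):
        for c in range(cols):
            # Vertical edge: the value changes between left and right neighbours.
            if c + 1 < cols and image[r][c] != image[r][c + 1]:
                vertical.add((r, c))
            # Horizontal edge: the value changes between upper and lower neighbours.
            if r + 1 < rows and image[r][c] != image[r + 1][c]:
                horizontal.add((r, c))
    return horizontal, vertical

def find_corners(image):
    """Layer 2: pair up horizontal and vertical edge pixels that are close together."""
    horizontal, vertical = detect_edges(image)
    corners = []
    for h in horizontal:
        for v in vertical:
            # "Close" here means at most one pixel apart in each direction.
            if max(abs(h[0] - v[0]), abs(h[1] - v[1])) <= 1:
                corners.append((h, v))
    return sorted(corners)

# Hypothetical image: a bright 2x2 square in the bottom-right of a dark field.
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
corners = find_corners(image)  # one pair, at the square's top-left corner
```

The second layer never looks at raw pixels; it only works with what the first layer found, which is the sense in which each layer builds on the one below it.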
It’s a fascinating and revealing piece on the bits and pieces the future will be built on, and if you’re interested, you can hit the original article at Wired below for the details.
Posts: 718; Member since: Nov 23, 2012
It's only a matter of time... Cyberdyne!
posted on Feb 19, 2013, 3:32 AM 5
Posts: 7305; Member since: Dec 05, 2009
The difference being that Google isn't interwoven with the military, and isn't likely to be (other than Android phones being used).
posted on Feb 19, 2013, 3:34 AM 1
Posts: 6023; Member since: Dec 30, 2010
"And while we were impressed with Siri when it first launched on the iPhone 4S in 2011, it was Google’s Voice Search swift and almost flawless voice recognition technology that has set the bar this high for voice." Siri - when it doesn't or didn't recognize something... it shows or recommends a picture of stallions. Google Voice recognition - it rarely doesn't recognize you. So... are you saying all voice recognition is a gimmick... or just Google's version? Also, Google Now was set up to just give you information based on searches and other things you can set up. Siri had an update to do something similar after the fact. I'm just curious what you think is a gimmick...
posted on Feb 19, 2013, 6:22 AM 1
Honestly, Google Voice just destroys s**tVoice by Samsuck. I speak with a heavy Asian accent, and surprisingly Google Voice manages to get things right 95% of the time! I do like the personality of Siri, though, but it's slower, and Google Voice does a much better job of interpreting your words. S Voice, though, is just terrible.
posted on Feb 19, 2013, 5:19 AM 5
Posts: 337; Member since: Dec 21, 2012
I agree that S-Voice sucks, but I don't agree with you saying "Samsuck". That's disrespect to one of the best smartphone makers. I never use S-Voice on my Note II. Google voice search is better, smoother, and most importantly FASTER. Samsung will make everyone's jaw drop announcing the S4 and Note III.
posted on Feb 19, 2013, 9:32 AM 1
Posts: 456; Member since: Nov 19, 2011
"It all started materializing to users with the iPhone 4S and Android devices of about that time" Nah, PA as usual skewing the facts towards Apple. Not even close, Google had solid voice commands in 2009 and the fantastic Google Now, well now... Apple had no such service until late '11
posted on Feb 19, 2013, 9:03 AM 0
Posts: 1; Member since: Sep 28, 2013
If you think this technology is "amazing" or "acts like a brain" you haven't really tried it. I say "Google Ants" and it brings up "Google Finance". OK, similar sounds. I say "Google Ant" and it says "Google aunt." OK, again, similar sound. I said "Google Ant A-N-T" (spelling it out) and it gives me some random stock price. Tried something else. "Google Insect", well, that works, but probably only because there is nothing similar to it. I try "Insect Ant" and it says "Insect Aunt", so it's clearly incapable of using context clues to determine what I said. From here I tried clicking the words. Sure enough, there was a drop-down to select other options, with Ant being one of them. Unfortunately, clicking it doesn't do anything. Once I click anything else or click outside it, the word is replaced with aunt again. If only it could at least allow me to replace the word and show new results it would be useful, but alas, that doesn't work. I say "Baa Baa Black Sheep" and it thinks "Bob, Bob, Black Sheep." (Note: contrary to what the subtext says in this post, the word sheep is NOT offensive. Nothing wrong with fluffy animals or nursery rhymes. If I get banned for saying it, fine.) I say "Google Amaze" and it thinks I said "Google Mail". I say "Google Maze" and it hears that, but what if I wanted to know about the crop? Sadly, similar sounding words just wouldn't work. Shouldn't it be able to take information from the camera to put it into context, though? I tried just saying "ant" again and it says "aunt", so I say "No! A-N-T" and it searches for "No estan team" (WTF!?). See, one main problem with the system is that it is incapable of learning my speech patterns or recognizing and learning from mistakes. If it REALLY worked like a brain it would have a built-in system for recognizing errors.
If I say the same word repeatedly with a tenser and tenser tone, chances are it didn't understand me, but it just can't learn to understand me. Oh, now this is funny: if I use profanities it will search for just the first letter. Like F***. I said that one more clearly, and it searched for Bach. Anyway, they clearly have a word filter system, but even similar sounding words tend to get filtered half the time. This is a slippery slope, because it is up to Google to decide what is offensive. My conclusion: Google voice search will only work half the time and only for some of the most common phrases. It will never learn to adjust itself to you, and will never understand similar sounding words that are less common. It won't decipher what you are probably saying based on the context or usage. It has possible uses, but those uses are very limited. Its only real strength is that when it does understand you, it is capable of quickly delivering results and can even handle a long sentence. Also, it doesn't appear to be affected by music in the background.
posted on Sep 28, 2013, 12:41 AM 0