The Google Assistant now sounds significantly better, thanks to the magic of machine learning


In the years since the Assistant launched, Google has been insistent on proving that it isn't just a cute gimmick, but a fully fledged product. Just last week, alongside the Pixel 2 siblings (and a whole bunch of other stuff), the company announced two new dedicated smart speakers, joining the already huge family of Assistant-enabled devices that includes last year's Google Home, as well as pretty much any Android device running Marshmallow and above.

But expansion without improvement would be next to worthless, especially considering the Assistant (and, let's be honest, AI assistants in general) could certainly use some more tuning up. Today, Google announced just that: a major improvement to the Assistant's voice synthesis, which is pretty much the key interface between it and the user.

The improvements are a result of Alphabet's 2014 acquisition of British AI firm DeepMind, which has since developed an audio-generating neural network (it's 2017, so of course it's a neural network) called WaveNet. And while WaveNet debuted all the way back in 2016, Google has now announced its integration into the Assistant, hence the improvements. Hear for yourself:

Audio clip from before WaveNet:


Audio clip from after WaveNet:


In short, the differences are subtle but still fairly noticeable: the distinctive choppiness usually associated with computer-generated voices is completely gone, making the speech sound almost natural (proper intonation still seems to be a problem, though). And on the user-invisible side, generation is also 1,000 times faster than with the previously used model, which we imagine will have a positive impact on Google's server electricity bill.
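For the curious: the core idea behind WaveNet-style models is generating raw audio one sample at a time, where each new sample may only depend on past samples, using stacks of dilated convolutions so that a handful of layers can "see" thousands of samples back. Here's a toy numpy sketch of that building block (an illustration of the general technique only, not DeepMind's actual implementation):

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """1-D causal convolution: output at time t depends only on
    x[t], x[t - d], x[t - 2d], ... -- never on future samples."""
    k = len(w)
    pad = dilation * (k - 1)
    xp = np.concatenate([np.zeros(pad), x])  # left-pad so no future leaks in
    return np.array([
        sum(w[i] * xp[t + pad - i * dilation] for i in range(k))
        for t in range(len(x))
    ])

# Stacking layers with dilations 1, 2, 4, 8, ... doubles the reach of the
# network with each layer, which is how WaveNet-style models cover long
# stretches of raw audio cheaply.
dilations = [1, 2, 4, 8, 16]
kernel_size = 2
receptive_field = 1 + sum(d * (kernel_size - 1) for d in dilations)
print(receptive_field)  # 32 past samples covered by just 5 layers
```

With kernel size 2, five layers already reach 32 samples back; doubling the dilation a few more times reaches thousands, which is what makes sample-by-sample audio generation tractable.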

Right now, WaveNet has only been trained on U.S. English and Japanese, so people using other voices won't hear a difference. However, we imagine other languages will also start utilizing the new technology sometime soon. Until then, the source link below provides some more before/after examples, as well as a good explanation of how WaveNet actually works, so interested parties may feel free to check it out:

6 Comments

1. Zack_2014

Posts: 677; Member since: Mar 25, 2014

Isn't Bixby the one with the most natural sounding assistant?

3. Eclectech

Posts: 349; Member since: May 01, 2013

You're joking, right? ...I'm just going to laugh anyway. Ha ha

2. surethom1

Posts: 31; Member since: May 01, 2009

Come to British English soon. This is a British company bought by Google, and they still have not implemented British English at the same time. Shame on you, Google.

4. Finalflash

Posts: 4063; Member since: Jul 23, 2013

Yea, but British English isn't pronounced as phonetically as US English. It isn't even written as phonetically as US English. So it will be a while before they can get it working with British English.

5. DnB925Art

Posts: 1168; Member since: May 23, 2013

True but General North American English (United States and most of Canada) is the one that has the most English speakers in the world and is also much more homogenous compared to British English. Much larger data set for machine learning.

6. jonathanfiuwx

Posts: 182; Member since: Mar 10, 2017

just because uk and usa are brothers at arms and soul, i can say this shamelessly... USA! USA! USA! USA!
