The Google Assistant now sounds significantly better, thanks to the magic of machine learning


In the years since the Assistant launched, Google has been insistent on proving that it isn't just a cute gimmick, but a fully fledged product. Just last week, alongside the Pixel 2 siblings (and a whole bunch of other stuff), the company announced two new dedicated smart speakers, joining the already huge family of Assistant-enabled devices that includes last year's Google Home, as well as pretty much any Android device running Marshmallow and above.

But expansion without improvement would be next to worthless, especially considering the Assistant (and, let's be honest, AI assistants in general) could certainly use some more tuning up. Today, Google announced just that: a major improvement to the Assistant's voice synthesis, which is pretty much the key interface between it and the user.

The improvements are a result of Alphabet's 2014 acquisition of British AI firm DeepMind, which has since developed an audio-generating neural network (it's 2017, so of course it's a neural network) called WaveNet. And while WaveNet debuted all the way back in 2016, Google has now announced its integration into the Assistant, hence the improvements. Hear for yourself:

Audio clip from before WaveNet:


Audio clip from after WaveNet:


In short, the differences are subtle but still fairly noticeable: the distinctive choppiness usually associated with computer-generated voices is completely gone, making the speech sound almost natural (proper intonation still seems to be a problem, though). And on the user-invisible side, generation is also 1,000 times faster than with the previously used model, which we imagine will have a positive impact on Google's server electricity bill.
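For the curious: the core idea behind WaveNet-style models is generating raw audio one sample at a time, where each new sample may only depend on past samples, using stacks of dilated convolutions so that a handful of layers can "see" thousands of samples back. Here's a toy numpy sketch of that building block (an illustration of the general technique only, not DeepMind's actual implementation):

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """1-D causal convolution: output at time t depends only on
    x[t], x[t - d], x[t - 2d], ... -- never on future samples."""
    k = len(w)
    pad = dilation * (k - 1)
    xp = np.concatenate([np.zeros(pad), x])  # left-pad so no future leaks in
    return np.array([
        sum(w[i] * xp[t + pad - i * dilation] for i in range(k))
        for t in range(len(x))
    ])

# Stacking layers with dilations 1, 2, 4, 8, ... doubles the reach of the
# network with each layer, which is how WaveNet-style models cover long
# stretches of raw audio cheaply.
dilations = [1, 2, 4, 8, 16]
kernel_size = 2
receptive_field = 1 + sum(d * (kernel_size - 1) for d in dilations)
print(receptive_field)  # 32 past samples covered by just 5 layers
```

With kernel size 2, five layers already reach 32 samples back; doubling the dilation a few more times reaches thousands, which is what makes sample-by-sample audio generation tractable.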

Right now, WaveNet has only been trained on U.S. English and Japanese, so people using other voices won't hear a difference. However, we imagine other languages will also start utilizing the new technology sometime soon. Until then, the source link below provides some more before/after examples, as well as a good explanation of how WaveNet actually works, so interested parties may feel free to check it out:

6 Comments

1. Zack_2014

Posts: 677; Member since: Mar 25, 2014

Isn't Bixby the one with the most natural sounding assistant?

3. Eclectech

Posts: 349; Member since: May 01, 2013

You're joking, right? ...I'm just going to laugh anyway. Ha ha

2. surethom1

Posts: 31; Member since: May 01, 2009

Come to British English soon. This is a British company bought by Google, and they still have not implemented British English at the same time. Shame on you, Google.

4. Finalflash

Posts: 4063; Member since: Jul 23, 2013

Yea, but British English isn't pronounced as phonetically as US English. It isn't even written as phonetically as US English. So it will be a while before they can get it working with British English.

5. DnB925Art

Posts: 1168; Member since: May 23, 2013

True but General North American English (United States and most of Canada) is the one that has the most English speakers in the world and is also much more homogenous compared to British English. Much larger data set for machine learning.

6. jonathanfiuwx

Posts: 182; Member since: Mar 10, 2017

just because uk and usa are brothers at arms and soul, i can say this shamelessly... USA! USA! USA! USA!
