Catching Siri: An in-depth look at voice command apps on Android

Obviously, the big hoopla these days is all about Siri and voice commands with your mobile device. Of course, Apple was not the first company to introduce voice commands or dictation on mobile, but Apple has done three very important things with Siri:

  1. Marketed it to the max - This is where Apple shines, and it has done it again with Siri. Google has had Voice Actions on Android for about 2 years now, and even before that there was always the Vlingo app. Siri even started out life as an app on iOS, but Apple bought it, integrated it, and has marketed it as the premier feature of the iPhone 4S, whereas many Android users probably still don't even know that Voice Actions exist. 
  2. Used natural language - As we have said, Apple was not the first to put voice commands or dictation to market, but bringing together the natural language AI of Siri with the voice recognition power of Nuance has created an elegant solution that can speed up many mundane tasks. 
  3. Anthropomorphized the iPhone - People tend to feel awkward having a one-sided voice interaction with their phone, but Siri mitigates that issue by replying to requests with a voice of its own. Add on the wit and snark - the personality - and it feels more like interacting with a person than a device. 

These things are important because they are the points that put Siri ahead of all challengers: in the minds of many consumers (because of the marketing), in actual performance (because of natural language), and in giving users a more human connection to the product (anthropomorphism). As we've talked about, this was not a feature designed to catch up to what Google was offering, but a system designed to leapfrog all voice command options and give the iPhone 4S a killer feature. 

All that said, there are options for Android users looking to approximate what iPhone users have in Siri. There is no one-stop solution just yet, but there are options that can come close, and each caters to the specific needs a user may have. 

Text input/dictation

The first stop on our tour is text input and dictation. Android users have the built-in Google dictation option, which comes on all Android 2.3+ devices. It does a pretty good job, and can learn over time to understand your voice better. And, as we've seen with the Galaxy Nexus announcement, Android dictation will be real-time starting in Ice Cream Sandwich. Google's offering is certainly good enough for voice commands and searches, but it can be annoying for dictation, mostly due to non-existent auto-formatting. However, as we all know, Siri is powered by Nuance, which has been building its voice recognition database for over a decade with its Dragon Dictate software and other products. And, Nuance does have its own alternative keyboard on Android called FlexT9. 

FlexT9 combines Nuance voice dictation with a Swype-like gesture keyboard, which came from Nuance's acquisition of ShapeWriter. So, in noisy situations, you have the speed of a gesture keyboard, but in quieter situations, you also have dictation that is the most accurate available for Android. Now, even though Nuance powers both FlexT9 and Siri, Apple has been able to get some bonuses which Android users won't find. FlexT9 offers a better experience than Google's stock voice recognition for the simple fact that FlexT9 has a much larger word database, which is filled with tons of proper nouns including companies, celebrities, etc. But, while FlexT9 is great at auto-capitalizing proper nouns (Google dictation doesn't even capitalize proper names), the trick of capitalizing other random words by preceding them with the trigger "cap" or "cap next" doesn't exist in FlexT9 and only works for Siri users. Additionally, if FlexT9 doesn't understand what you've said, it likely won't return anything, whereas Siri will return a best guess and allow you to choose alternate options if the best guess is incorrect. 

Also, it should be mentioned that while voice recognition accuracy is largely dependent on the software backing it, a large part is also based on the quality of the microphone and ambient noise filters available on your device. Apple obviously worked hard to have a quality microphone and good noise filters on the iPhone 4S, because even in noisy situations the recognition is fairly accurate. 

Voice command

Dictation is only part of the equation, though. The other side of the coin is voice command. As we mentioned earlier, the big evolutionary feature of Siri is the use of natural language. Keyword-initiated voice commands have been around for a long time, but it seems likely that Apple avoided this option because it puts a distance between the user and device. Apple has always been determined to make users feel connected to their products. That was the reason behind putting a handle on the top of the original iMac, as well as putting the tapered edge on the iPad to entice users to just "scoop it up" instead of feeling a need to be careful in lifting the device. By using natural language combined with Siri's witty responses, Siri and, by extension, the iPhone 4S itself become anthropomorphized a bit and feel more like a personal assistant than just a smartphone. This is something that most Android options can't match, although some are trying. But, for straightforward voice commands, there are a number of options. 

A couple of tips to start: almost all Android voice command apps use the Google voice recognition system. While Google's voice recognition isn't quite as good as Nuance's, it is pretty accurate, and it gets better the more you use it, which is a big benefit. And, when you aren't dictating, issues like capitalization don't matter, so it tends to work well enough for voice commands. 

The power of Android: Customization

Another thing to note is that because this is Android that we're talking about, customization options abound. The number one option to look into if you want to jazz up your virtual assistant is the SVOX app, which allows you to change the default voice of the text-to-speech engine. The default is okay, but definitely quite robotic. SVOX offers 5 different options for US English, and it offers over 40 voices in more than 25 languages in total. You do have to purchase the voices for $3, but you can get a 2-week free trial of any voice, so you can see if you like it or not. 

Another nice option is an app from K&J Software with the uninspired name Voice Control without Internet. We know that with a name like that it doesn't need much explanation, but it basically acts as a limited-functionality backup in case you don't have a data connection, but still want the benefits of some voice commands. The app supports just a few commands: send message, check e-mail, open browser, open calculator, make a phone call and open Google Maps. Of course, without an Internet connection it doesn't seem necessary to open your browser, Google Maps, or check e-mail, but it's a nice start.

Also, there are extra tricks available to you if you set up mobile sharing on Twitter or Facebook. A few of the apps will allow you to update your status with a command, but some won't. However, all have commands to send text messages, so if you go into your account settings on either Twitter or Facebook and set up the mobile phone options, you can update your status with a text message. And, doing this with Twitter is a good idea anyway, because it adds options above just updating your status to be able to follow or unfollow users, DM someone, or retweet a user's newest tweet. Be warned though, these options require texting to a short-code number, which isn't supported by some SMS apps like Google Voice.

Inherent advantages and disadvantages of Android apps over Siri

As we said, Siri was not the first voice command app to hit the mobile ecosystem, but as is often the case with Apple products, Siri took systems that worked "well enough" before and made them into something that connects with people rather than just works. The other big thing that Siri did was disintermediate Google from the search equation. Often, this works well because of integration from Wolfram Alpha and other services, but there are things that are flat out missing from Siri. One big thing is that location-based searches don't work in some regions (like Canada), whereas location-based searches on Android are always funneled through Google Maps, which has most of the world covered. Additionally, Google is just more reliable than Siri right now. There have been a number of prolonged Siri server outages since its release, whereas we have never heard of an outage with the Google speech servers, and the only time we couldn't contact Google servers was when we had no data connection.

A disadvantage to most of the Android options is that they are almost all built on Google Voice Search speech recognition, so, aside from any issues you may have with Google's recognition accuracy, almost all of the apps we tested will not work unless you have Google Voice Search installed. The only exception to that rule is Vlingo, which of course predated Google Voice Search (and Siri of course) on mobile platforms, and uses its own speech recognition software on the back end. Additionally, most Android options for voice command are still based on limited keyword and keyphrase sets. Siri mimics natural language recognition by accepting a far wider array of keywords and phrases, and most Android options have yet to catch up on that, though they are trying.
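To make the keyword-versus-natural-language distinction concrete, here is a minimal sketch of how a keyphrase-based app might map several phrasings onto one command. The intent names and phrase lists below are our own invention for illustration, not taken from any of the apps we tested; the wider the phrase table, the more "natural" the app feels.

```python
import re

# Hypothetical phrase table: each intent lists several accepted phrasings.
# A wider table is how an app can mimic natural language understanding.
INTENT_PHRASES = {
    "get_location": ["where am i", "what is my location", "show my location"],
    "get_directions": ["directions to", "navigate to", "how do i get to"],
    "send_text": ["send a text to", "send message to", "text"],
}


def normalize(utterance):
    """Lowercase and strip punctuation so phrasing variants compare cleanly."""
    return re.sub(r"[^a-z0-9 ]", "", utterance.lower()).strip()


def match_intent(utterance):
    """Return (intent, remainder) for the longest matching keyphrase.

    Returns (None, normalized_text) when no keyphrase matches.
    """
    text = normalize(utterance)
    best = None  # (intent, phrase) of the longest match so far
    for intent, phrases in INTENT_PHRASES.items():
        for phrase in phrases:
            if text == phrase or text.startswith(phrase + " "):
                if best is None or len(phrase) > len(best[1]):
                    best = (intent, phrase)
    if best is None:
        return None, text
    intent, phrase = best
    return intent, text[len(phrase):].strip()
```

With this table, "Navigate to the airport" and "Directions to the airport" both land on the same intent, while anything outside the table simply falls through unrecognized, which is the limitation described above.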

To make this whole process a bit easier, and because there are so many options to cover, we're splitting the results into 3 categories: Don't bother, The Meh, and La Crème. In total, we've gone through 8 different apps all trying to offer the best voice commands on Android, although there are far more than 8 options available in the Android Market. It shook out pretty well too: there are 3 apps in the "Don't bother" section, 2 for "Meh", and 3 were "La Crème." Let's get this party started!

Don't bother

Eva/Evan

Eva/Evan has the dishonor of being our number one app that you should avoid at all costs. Eva(n) really tries to personify the personal assistant idea, but ends up being a sloppy cash grab. Users can get a free trial of the service through either the Eva Intern or Evan Intern apps. As you can guess, Eva is the female voice option, and Evan the male, but both are the same app otherwise. Not only is this the only app that has a limited free trial before requiring a purchase, but after the free trial the app will cost you $8.99, which is 3 times the cost of the only other voice command app that even has a paid option. 

Eva(n) does try to play up the personal assistant idea, but fails for a few reasons: first, the picture of Eva stares at you with dead eyes; second, it repeats everything you say back to you in a way that is more reminiscent of someone just learning a language than of an (artificially) intelligent service; and third, it has no real personality. It does offer a number of options for voice commands, but not much that sets it apart from other, better apps. Two commands that only exist in Eva(n) are creating journal entries and making expense reports. Theoretically, you could say something like "I just spent $30 on gas" and it will build an expense report for you. 

We say "theoretically" because Eva(n) is actually completely useless in practical application. Eva(n) is extremely slow because it not only reads back your every query, but has long delays in finding answers, or even figuring out what your question is. Even though it uses the same Google voice recognition as everything else, it pulls all of the possible phrases along with the most likely match of what you said. This means that no matter what command you give, Eva(n) will give back a list of possibilities for what you said, scroll through them, then more often than not, it will tell you that it understands you, but doesn't know what to do. Or, it will ask you to tap on the command you want, which not only takes up absurd amounts of time, but also completely defeats the point of voice command. It's never good enough to say "I understand," when you can't show that you understand.
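For contrast, here is a rough sketch of what an app could do with that list of possible phrases instead of showing it to the user. We are assuming the recognizer hands back candidate transcriptions ordered from most to least likely, and the command keywords here are hypothetical; this is not how Eva(n) or any other app is actually implemented.

```python
# Hypothetical set of command keywords the app knows how to act on.
KNOWN_COMMANDS = ("call", "text", "navigate to", "search for")


def pick_command(hypotheses):
    """Pick the first candidate transcription that starts with a known command.

    `hypotheses` is assumed to be ordered best-first. Returns a
    (command, argument) tuple, or None if no candidate is actionable.
    """
    for text in hypotheses:
        lowered = text.lower().strip()
        for command in KNOWN_COMMANDS:
            if lowered == command or lowered.startswith(command + " "):
                return command, lowered[len(command):].strip()
    return None
```

The idea is that a misheard top guess such as "coal mom" gets skipped silently in favor of the next candidate, "call mom", so the user never has to tap through a list.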

One last note isn't specific to Eva(n), but is a good thing to keep in mind: some of the Android voice command apps offer an option to have the app trigger when you shake your phone. This sounds good in theory, but again, in practice it tends to be a bit useless, because the app will be triggered simply by the phone being in your pocket as you walk. That option can be turned off, so that's a good setting to check when you first fire up Eva(n) (assuming you ignore our recommendation to burn this app), or if you choose Jeannie. 

Iris

Iris is best known as the Android Siri alternative that was built in just 8 hours, and that is readily apparent right from the start, even though we reviewed this app after it had been updated (so we're assuming it had at least 16 hours of work done). The two most important features for a voice command app on Android are: 1) mapping to the long-press of the search button, or failing that 2) a widget. Iris has neither even after its recent update. This means the only option for using the app is to leave whatever you're doing and open the app itself, unlike other options which would be available at all times by long-pressing the search button, and some which can be triggered by shaking your phone. The entire point of voice command is speed and efficiency, so not having the most efficient and fastest way to call up the app itself is a deal-killer for us. 

Aside from that failure, Iris does have its strengths. Iris does well for quickly finding specific information or initiating Google searches, and it will read answers to you. By specific information, we mean things like most who/where/when/what questions for data. Iris easily answered when we asked "Who is Charles Barkley?" or "When was the Declaration of Independence signed?" or "What is the population of London?" The new version 2.12 has also added hooks into Google Maps, allowing you to search for nearby places. 

However, outside of those types of questions, Iris can have trouble. It is a bit wonky with math, answering the question "What's 5*38?" with "10 I think, but I'm not good at math." But, it also knows a bit more than Drake and was able to accurately give the square root of 69. More opaque questions won't get you anywhere. We asked "Was Michael Jordan the best basketball player of all time?" To which, Iris answered, "Ask again later. Ask Dancing Alice if it is." And, when we asked, "Who is Dancing Alice?" Iris didn't know. 

But, the main point of voice commands is in the commands, not just being able to get information read to you. Every app that we tested could do everything that Iris can as far as searching, and all had extra commands beyond the basics. Again, this is a weakness for Iris. The only options available for commands are to initiate calls, send texts, and look up contacts. Even in this, Iris doesn't customize, as all texts are sent through the Messaging app, with no option to use Google Voice or any other SMS app you may prefer. 

Overall though, Iris is fairly impressive for such an early build of an app. Version 2 shows that the team is still working on Iris, but still has a lot of work to do. But, right now, we can't really recommend Iris for much, because you can get more value from most other apps available. 

Cluzee

Cluzee is a hot mess. The app was released just as we were putting the finishing touches on this piece, and received the "praise" of certain websites as the first true Siri competitor on Android. The trouble is, we're pretty sure none of those sites actually tried using the app before doing those write-ups. 

First, Cluzee is a beast. It clocks in at over 20 MB installed and cannot be moved to an SD card. Of course, this won't matter so much to users with newer handsets, but it made testing the app on our Nexus One a bit tricky. It is also a resource hog: it routinely set off our Watchdog warnings for using over 65% of the CPU and slowing everything down. Second, Cluzee is lousy with bugs. It force closes so often that after being in the Android Market for just a day, it had 500 ratings and just 2 out of 5 stars. 

That said, Cluzee is also extremely ambitious. It is designed well, with an intuitive UI. And, on top of the standard commands and queries that most other apps can handle, it claims to offer a number of features that no other can match, such as a health planner, travel planner, and personal radio (aka weather and news headlines which are read to you). We say "claims to offer" because of course the app has to work before anyone can use said features. When we tried to use the daily planner, the app hung on populating the calendar (even though there were likely no more than one or two things per day on the calendar). Many of the options led to force closes or general slowdown.

The biggest problem with Cluzee isn't that it's a buggy mess; it's that beyond the buggy mess, we can see the start of what looks like a great idea. The addition of personalized recommendations, deals, personal planner, health planner, and headlines adds quite a bit to the standard voice command experience. To a large extent, Cluzee seems to be trying to mesh Vlingo (which Cluzee has almost completely ripped off as far as car mode and UI, just with a color change from blue and black to white and orange) with the functions found in other options. If Cluzee can clean up its bugs and shrink the app to a more manageable size, there could be something special here, but right now there's no point in even installing it.

The Meh

Next up, we have the apps that are good, but just not good enough for one reason or another. 

Andy

Andy is one of the newer additions to the voice command options on Android, but its youth doesn't show too much. Andy is similar to Iris in that it gets most of its functions from Google search, such as getting info, performing calculations, currency conversion, getting distances and direction information, etc. Unfortunately, it doesn't push much past that into more useful things like updating social networks, or add the extra flair of personality.

Andy's best feature is that it can do well in understanding a lot of variations on different queries. This is of course how Siri mimics understanding natural language, but Andy even lists out all of the options in its manual. There are about 30 different ways to ask for your location, about 50 phrases that will work for getting directions and 20 ways to ask how to spell a word. Of course, that last one is more of a gimmick than anything useful, because if you can't spell a word, it's probably too complicated of a word for speech recognition to understand easily.

Overall, Andy is definitely good, and is pushing towards the more natural language side of the voice command equation, but it is lagging behind in terms of personality and functionality.

Voice Search (Google)

Google's Voice Search seems like the most limited of the apps on the surface, but is really quite powerful, which is why most of the other options are built on top of this app. Unlike other options, Google doesn't try to read answers back to you. If you want to know something, you'll find it in the Google search pages. This is something of a double-edged sword, because on certain queries (like "Who is" questions), it may lead to a longer time between asking a question and getting an answer, because you have to drill into the first search result. But, for many things, like math questions, currency conversions, etc., Google search gives the answer to you quickly and clearly above other search results anyway; you just have to read. 

Voice Search gives options for Google searches, Maps searches, directions, sending texts or e-mails, making calls, sending yourself an e-mail note, listening to music on your device, or creating alarms. If you don't need advanced options like toggling settings, creating new calendar entries, searching the Android Market or updating various social networks, Google Voice Search can be a solid option. It is also a very old-school Google product in that it is functional, but has no design flair or personality to it. You just say what you want, and it opens up the search in a browser, Maps, etc. Other apps try to add flair with images or jokes, but Google just gives you information. It is nice though that when sending texts or e-mails or when creating an alarm, Voice Search will let you set your default app, even including 3rd party apps, which is a feature that surprisingly few apps had. 

Ultimately though, Google's option has no personality, and is limited compared to other apps available. It can be very fast, but sometimes drilling down into search results and reading the answer you need isn't fast enough.

La Crème

Voice Actions (Jeannie)

Voice Actions is one of the granddaddies of the voice command apps for Android, and we really struggled over whether to put it in the "Meh" category. It's been around almost as long as Google's own Voice Search, and it is the best of the bunch when it comes to adding in the personality that you can get from Siri. Once Siri was released, developer Pannous decided to rename Voice Actions as Jeannie (although you can change its name), and has done a great job at adding in little witty responses and irreverent commands. You can ask Jeannie to read you poetry, or even paint a picture, although the pictures tend to be completely random. For example, when we asked it to paint a picture of Charles Barkley, we received a diagram of a human skeleton with major bones labeled. 

As far as functionality, Jeannie is one of the best options for voice commands on Android, because it has a great array of commands including all of the most popular commands for calling, texting, searching, setting alarms or notifications, and even the deeper functions that Android still has a lead on like toggling WiFi or Bluetooth on and off. Jeannie also has extra features like searching for images or even sounds, and updating your Twitter or Facebook status. There are controls to have Jeannie speak faster, slower or louder, and even commands to make a video or record audio. Best of all perhaps, you can search the Android Market using the "install ..." keyword. 

The command set is great, the recognition is accurate and can handle a wide variety of phrasing, and responses are fast and often witty (and even faster and wittier if you buy the $2.99 pro version). The problem with Jeannie is in execution. You can turn off the "shake to wake" option, which is nice, but there is no way to turn off the behavior that wakes the app when you click the play/pause button on a headset, and if you have lockscreen music player controls, that play/pause button will wake Jeannie as well. This means that not only are all of those media controls useless when this app is running, but if you do make the mistake of waking Jeannie, you'll have to unlock your phone and use the back button to put it away. And, that's another problem: often using the home button will not put Jeannie away, just put it in the background, meaning it will still be waiting for input and messing with you until you close it properly. *Update* After this piece was done, developer Pannous contacted us to confirm that the inline control behaviors had been changed. Now, Jeannie will not be triggered unless you double-tap or long-press the headset controls.

Jeannie may even get our top nod overall because it has one of the best feature sets, and it really does try to inject the humor and personality that makes Siri something unique. Also be warned that there is no widget for Jeannie, but it can be mapped to the long-press on your search button. 

SpeakToIt

SpeakToIt is actually very similar to Jeannie in a couple of ways. First, where Jeannie injected personality into its app through the spoken responses, SpeakToIt has decided to give you a customizable avatar to connect with. It's a nice touch, especially compared to the photo avatars for Eva(n), and offers a sizable number of options for customization. And, unlike Eva's dead eyes, your SpeakToIt avatar is animated and can show a range of emotion, which helps a lot in creating a connection with the app. Second, like Jeannie, SpeakToIt offers a huge array of commands, but has flaws in execution, although we found the flaws to be less troublesome than Jeannie's. 

SpeakToIt does offer a great array of voice commands, the most along with Jeannie. And, where Jeannie offers silly things like "paint me a picture" and one special feature with the Android Market search, SpeakToIt has a couple of site-specific searches which are quite useful, like searching IMDb and Amazon, although the results can be spotty if the voice recognition doesn't understand "IMDb". We had it hear "I am db" a couple of times, but overall recognition and searches were on point. Once, though, when we asked "Who is Adrien Brody", SpeakToIt gave the bio for Stuart Price, which was quite odd. SpeakToIt also offers the option to type your question in addition to the voice commands. 

The only real annoyance that we had with SpeakToIt is that it tries too hard to keep you in the app. Where other apps will pop open the browser or Maps app to handle various searches, SpeakToIt just opens a small window within itself to show you the results, while still allowing you to easily ask another question. It can be annoying if you prefer having your search results in full screen app instances, but it's not much of a problem really, and aside from some maps searches, which can be laggy on slower hardware, it often makes a search a bit easier.

While we were creating this piece, SpeakToIt got an update that made it even better. The newest update added 4 voices from iSpeech to choose from - a male and a female in each of US English and British English. These voices are much smoother than the stock voice and are free, so there is no need to pay extra for a new voice through SVOX. One slight oddity with the update was that these new voices wouldn't work if the ringer was off on our device. With the phone on vibrate, the voice would be muted, but with the ringer on, the voice worked normally. This seems like a bug that will get ironed out in a future update, or it may turn out to be a feature. It would follow that if you don't want your phone to make a noise, you wouldn't want your "personal assistant" to speak to you either. 

Overall, we would give the ultimate recommendation to SpeakToIt though, because despite an odd way of handling requests, and general troubles with initiating commands, it does offer commands on par with Jeannie, but without the more annoying behavioral issues, and we couldn't help but enjoy our animated personal assistant avatar.

Vlingo Virtual Assistant

Vlingo is the one voice command app that predates all the rest on Android. Vlingo was there before Google launched its Voice Search app, although before that Vlingo charged $10 for its app, which dropped to free when Voice Search came along. That extra time in the game definitely shows as Vlingo is by far the most polished app available, and the speech recognition is fast, reliable, and very accurate. Even better, since Vlingo doesn't use Google's speech recognition, you get proper capitalization and automatic punctuation when dictating a message. No more need to say "question mark", "comma", or "period", because Vlingo puts it all in automatically.

Vlingo can handle all of the basic voice commands that you would expect, like search, sending messages, making calls, and updating social networks, and it has a couple of extra bells added in to help you book hotels or buy movie tickets. An odd thing here: when searching for hotels, Vlingo will give you the option to search the web or use the Kayak app if you have that installed on your phone. However, when searching for movie tickets, it will search the Fandango website, but it can't use the Fandango app if you have it installed. This very well could be the fault of the Fandango app and not Vlingo, but it's still a bit annoying.

Unfortunately, there are a number of actions which seem like they should be standard, but are conspicuously missing from Vlingo. There are no commands to set alarms or calendar events, no options to open music on your device, and no specific searches, like searching for news. What Vlingo does, it does very well, but it has a noticeably smaller command list than Jeannie or SpeakToIt. Vlingo also has completely ignored the new trend of adding voice read-back and personality to its app. Vlingo will not read answers to you, or make sassy comments. It is pure function.

Vlingo does have the added bonus of a great hands-free mode. Vlingo can be set to be always-on, and activated when you say "Hey Vlingo". Vlingo has worked together with companies like Jabra to offer better compatibility between the app and various Bluetooth devices, and has an In-Car mode, which will automatically turn on the auto-listen as well as a feature which will speak your messages to you. However, this feature only works with the standard Messaging app, and will not read messages from Google Voice or e-mail. Additionally, Vlingo only sends messages through the stock Messaging app with no options to use other SMS apps. That said, Vlingo's message composer is by far the best available, as it allows you to choose different contacts, or even switch what phone number or e-mail address you send messages to, where other options don't offer that kind of granular control.

Vlingo also excels in the widgets department with 4 different widgets available, if you're into that sort of thing.

Conclusion

As we said, no app on Android completely matches what Siri does, mostly due to the speech recognition and understanding of more natural language, but a few come extremely close. And, there are a number of options and commands available in various Android choices that Siri simply can't do. As tends to be the case with Android, there is no shortage of choices when looking for a voice command app. We have been finding ourselves drifting back towards the friendly eyes of our SpeakToIt assistant, although if you find yourself in a car or using a Bluetooth headset more often, or mainly use voice commands to send messages, Vlingo is likely the best choice because of its hands-free mode and quality dictation. But, ultimately it depends on what you are planning to use voice commands for. If you want to search, but don't care about having things read back to you, maybe Google's Voice Search is good enough. If you want some more irreverence and weirdness along with a great feature set, Jeannie is a great option. Overall though, we have to give the nod to SpeakToIt, because it has just as many features as Jeannie, but now includes the extra voices which are much better than the stock voice.

We know that may not be the most helpful conclusion, but cut us some slack. We've been talking to our smartphone for weeks putting this piece together, and we may have forgotten how to deal with real humans. We hope these overviews help, and point you towards the personal assistant that is best for you, because it is a personal choice. We know what we like best. What do you guys like?
