An In-depth analysis of the multilingual Voice Assistant inside Amazon's Android app and its user experience aspects
Welcome to a new blog series by Slang Labs, called "Voice Assistants in Mobile Apps". Here we tear down the voice assistant and search functionalities that businesses have added to their mobile applications and discuss them in detail. In this edition, we break down the voice search feature in the 'Amazon' mobile application, which has integrated Amazon's own assistant, Alexa.
To date, most attempts to add voice search to e-commerce apps have been half-hearted and half-baked. Most of these companies just tied the mic button in the search bar to Google's Speech Recognition and called it a night. Tata, Goodbye :) Well, not Amazon! They went a step further and added 'Alexa' to their shopping app.
Voice Search is taking over India. It looks like Amazon took notice of Google's Year in Search report 2018, which showed that 28% of all searches are happening through voice. That's not all. Voice searches are growing at 270% year on year in India, and Hindi voice searches have shown 400% YoY growth. It shouldn't come as a surprise that when Jio added Google Assistant to its Jio Phone, Google Assistant's usage in India jumped by 6X.
We believe that it's essential to recognise the trendsetters and show how they are adding voice assistants inside their applications and what the outcome has been. This analysis will also help developers understand how voice search is used and why assistants matter in applications. We have already broken down voice features in 'My Jio', 'Gaana', YouTube and Paytm Travel.
We have a standard, easy-to-understand, old-school mic icon on the top right side of the Amazon App's title bar. The icon is filled with white colour, which gives a clear contrast to the background colour. Placing the mic icon so prominently in the app indicates how seriously Amazon takes voice search in retail shopping. Amazon has shown trust in Alexa by integrating it in the application. The aim here is to improve the user experience while doing retail shopping.
Amazon has added Alexa's logo in the US application for voice retail shopping instead of the mic button. This is done due to a higher brand recall of Alexa, whose mindshare is much higher in the US when compared to India because of the vast market share of Echo devices running Alexa, a voice assistant by Amazon.
Users are introduced to this voice search mic by a simple coach mark. On clicking the mic button, the user sees a dialogue box which explains what the user can do with voice. On pressing the continue button, the user needs to grant mic permission. Alexa triggers after the permission screen, and the characteristic blue wave appears. You can also look at the videos of onboarding on our YouTube channel.
It is the most significant element on the dialogue box, attracting instant attention. This visual helps to set the context to the user and introduce them to the voice search feature.
There are a couple of utterances shown as an example to train the user. These utterances help in setting the right expectations for the user. Even in the Slang surface, we offer these sentences to guide the user.
Amazon asks the user to enable mic permissions for this feature. In Slang, we speak out the purpose of the permission first and only then show the permission prompt.
It might not seem like much, but there is quite a bit to break down here. Let's get started.
On clicking the mic button, the user gets to see a bluish-green wave at the bottom of the screen with a help utterance right above it. This visual sets the context, guides users on what they can ask and sets their expectations.
While the user is speaking, the greenish wave vibrates from the centre outwards. This is a good practice since the user gets a signal that they are being heard.
When the assistant detects silence or end of the utterance, the greenish waves go all the way till the end. The waves pulsate slightly as well.
While the user is speaking, the wave appears to be more of a horizontal line filled with moving blue colour.
As soon as the user presses the mic icon, Alexa plays a prompt sound, 'ting', and starts listening. The sound acts as an auditory cue, and the same 'ting' is played when Alexa stops listening. We have implemented something similar in Slang, although we use different prompt sounds for the start and the end of listening. The prompt sound helps the user understand when to begin speaking, making the application user-friendly.
After an unrecognised utterance, Amazon shows a help screen which informs the user that it couldn't find what the user aimed to search. It then displays a bunch of statements that the user can speak.
This dialogue box is opaque and has an Alexa button to speak again. Food for thought: why did Amazon not add the mic button here instead? Why add a button which users haven't been exposed to yet, instead of one which they recognise?
Full points to Amazon on the accuracy of their voice assistant. It's highly accurate even when spoken to in different accents. It even recognised long brand names and difficult product names (e.g. mamy poko pants) with almost no recognition errors, which is surprisingly good. Be it long-form product names or a product name made of just one ordinary word, they have managed to outdo themselves. The assistant's accuracy in comprehending input is top-notch.
Alexa excels over here as well. Breaking down the speed of individual components like ASR and NLP is not possible, as the app doesn't let you look under the hood. If it did, we would have been able to get more insights. Alexa, in the Amazon Shopping app, is extremely fast. Search results pop up in about a second or less. Kudos to the team that made this happen.
NLP is one area where Amazon is still falling behind. "I want to see toys Alexa" searched for 'Toys Alexa'. Removing fluff and stop words from an utterance is not that difficult a problem to solve. This area is where voice search in the Amazon app has been disappointing. The lack of a good NLP engine brings down the Voice Augmented eXperience in the Amazon app.
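To make the point about fluff and stop words concrete, here is a minimal Python sketch of the kind of cleanup we have in mind. It is not how Alexa or Slang actually works: the filler phrases, the stop-word list and the `clean_query` function are all hypothetical names chosen for illustration, and a production system would rely on a proper NLP engine with per-language vocabularies.

```python
import re

# Hypothetical vocabulary for illustration only; a real assistant would
# tune these lists per domain and per language.
WAKE_WORD = "alexa"
FILLER_PHRASES = ["i want to see", "i want to buy", "i want",
                  "show me", "search for", "can you find"]
STOP_WORDS = {"a", "an", "the", "some", "please"}


def clean_query(utterance: str) -> str:
    """Reduce a spoken utterance to a bare search query."""
    # Lowercase, replace punctuation with spaces, then split into words.
    words = re.sub(r"[^\w\s]", " ", utterance.lower()).split()

    # Drop the wake word only at the edges, and only if other words remain,
    # so that a search for "alexa" on its own still works.
    if len(words) > 1 and words[-1] == WAKE_WORD:
        words = words[:-1]
    if len(words) > 1 and words[0] == WAKE_WORD:
        words = words[1:]
    text = " ".join(words)

    # Strip a leading filler phrase, trying the longest phrases first.
    for phrase in sorted(FILLER_PHRASES, key=len, reverse=True):
        text = re.sub(rf"^{re.escape(phrase)}\b", "", text).strip()

    # Remove the remaining stop words; fall back to the unfiltered text
    # if nothing would be left.
    kept = [w for w in text.split() if w not in STOP_WORDS]
    return " ".join(kept) or text


print(clean_query("I want to see toys Alexa"))
```

With this sketch, the utterance from our test is reduced to just "toys" before it reaches the search box. The wake word is only stripped at the start or end of the utterance so that queries which genuinely mention Alexa in the middle are left alone.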
You can also ask Alexa what you can do with it using speech. It tells you the two things it can do via voice. Instead of just replying with a voice-only answer, it should utilise the screen and show the user all the things it can do.
During voice search onboarding, Amazon tells us two things we can do with the mic icon: track orders and search items. Since it's Alexa embedded in the app and not just another retail shopping voice assistant, it can do a lot more than tracking orders and searching items. This ability feels more creepy than useful. If I tap the mic icon and then talk to someone nearby, and I say something that is not a product in Amazon's database, it starts reading out search results, which feels awkward and weird.
Sometimes bowing down and saying, 'I did not understand that' is just fine.
I could not reproduce it, but it has even told me about the weather and random facts without my asking. Imagine Alexa speaking out, while you are shopping, "According to the NHS, you should wash your hands for at least 20 seconds".
If you switch the Amazon India app to Hindi, the mic icon disappears. Vernacular support is still missing here. With India being so linguistically diverse, supporting vernacular languages should be a high priority for assistants that cater to voice-based retail shopping needs.
This is highly subjective, but in my opinion, Alexa's UI could be much better. Right off the bat, the text size in the help window could be increased to improve readability. The hints shown on the screen in listening mode could be more prominent too.
Alexa has hands-down the best voice Augmented eXperience in any eCommerce app. (Probably not for long. *Wink*). Even in its current avatar, it's miles ahead of any other competitor.
Like any other brand, Amazon has been very hush-hush about the usage numbers for Alexa inside the Amazon Shopping app. In their blog on the "Great Indian Festival", published on 18th October 2020, we got the first glimpse of how many people are using Alexa inside their app.
In the run-up to the Great Indian Festival, Alexa answered over ~100K requests from customers on the Amazon shopping app. It helped the users navigate through their favourite stores such as the SMB Store, the Great Indian Bazaar, deals, gifting store and the Fun Zone.
Alexa received its highest single-day requests of over 1 million on the Amazon Shopping app to guide customers to their product searches, best deals, bill payments, music, and much more during Prime Exclusive Access.
Slang has recently introduced 'Slang for Grocery'. Its VAX is built explicitly for the Grocery and Retail domain. It's an off-the-shelf integration, which voice-enables your app in four languages (English, Hindi, Tamil and Kannada), with Malayalam in beta. With 'Slang for Grocery', you can voice-enable your retail grocery app in less than 2 days.
If you would like to add the most accurate multi-lingual voice assistant to your retail grocery app in just a couple of days, let us know at firstname.lastname@example.org.