Voice Assistant in Mobile Apps: YouTube

An in-depth analysis of the voice search functionality in the YouTube application and its performance through the lens of Voice Augmented Experience.

Welcome to a new blog series by Slang Labs, called "Voice Assistants in Mobile Apps". Here we tear down the voice assistant and search functionalities that businesses have added to their mobile applications and discuss them in detail. In this edition, we break down the voice search feature in the 'YouTube' mobile application, which surprisingly doesn't have Google Assistant built in.

We believe that it's essential to recognise the trendsetters and show the world how voice assistants are being integrated with applications and what the outcome is. This analysis will also help developers understand how voice search is used and why assistants matter in applications. We have already broken down voice features in 'My Jio', 'Gaana', Amazon and Paytm Travel.

YouTube's Voice Search before the overhaul

Old YouTube Voice Search Demo
Old UI of YouTube's Voice Search

Earlier versions of the YouTube app had a mic button in the search bar. On clicking this mic button, the user's voice query was transcribed to the search bar verbatim. Whenever users gave free-form sentences like "I want to watch Naagin", the search ended in no results found. Hindi voice search was a nightmare to even begin with.

YouTube's Voice Search overhaul

YouTube did a major overhaul of their voice feature in January of 2019. This update saw significant UI and functional changes to improve the overall voice search experience.

Visual Changes:

New User eXperience

Screenshot of YouTube's new voice search
NUX for the new Voice Search

The app starts by training users on the new voice search feature. It does this by showing a coach mark which says "New ways to search with your voice! Show me trending videos" in a blue dialogue box hovering over the mic button.

UI Change

Demo of YouTube's New Voice Search UI
YouTube's New Voice Search UI

With the new UI, a white overlay with a pulsating red mic takes over the whole screen on clicking the mic. YouTube hasn't forgotten about dark mode, where the mic is overlaid on a black screen instead. Right above the mic there are hints, such as 'Play Charlie Puth', which are personalised to the user. When the user speaks, the utterance is transcribed on the screen in a large, clearly visible font. We are seeing this trend of large font sizes in other apps made by Google for the Next Billion Users as well, e.g., Neighbourly.

Functional Change:

Thought to Action

Earlier, a user had to click on the mic button and then speak the utterance. This utterance was transcribed to the search bar, which showed the listings, and the user then selected a video from the listing. It was a time-consuming process.

Now, the user clicks on the mic button and speaks, for example, "Play A R Rahman", and YouTube directly plays the song, reducing the thought-to-action latency. This removes the time spent browsing for videos. The more time users spend watching videos, the more money YouTube makes.

It is important to note that this happens only where the user's intent is evident, for example, with 'play'. Other searches, like 'Show', still open the list of videos, from which the user can select a video.
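The split between 'play' and 'Show' described above can be illustrated with a small routing sketch. This is not YouTube's implementation; the verb lists, the `VoiceAction` type and the `route_utterance` function are hypothetical names chosen for illustration, and the only behaviour taken from the article is that 'play' leads straight to playback while 'Show' leads to a results list.

```python
from dataclasses import dataclass

# Hypothetical verb lists: the article only confirms 'play' and 'show'.
DIRECT_PLAY_VERBS = ("play",)
BROWSE_VERBS = ("show",)


@dataclass
class VoiceAction:
    kind: str   # "play_top_result" or "show_results"
    query: str  # text handed to the search backend


def route_utterance(utterance: str) -> VoiceAction:
    """Decide whether an utterance should play directly or list results."""
    words = utterance.strip().split()
    if not words:
        return VoiceAction("show_results", "")

    verb = words[0].lower()
    rest = " ".join(words[1:])

    # Evident intent: skip the listing and play the best match.
    if verb in DIRECT_PLAY_VERBS and rest:
        return VoiceAction("play_top_result", rest)

    # Browsing intent: drop the verb and show the listing.
    if verb in BROWSE_VERBS and rest:
        return VoiceAction("show_results", rest)

    # No recognised verb: treat the whole utterance as a plain search.
    return VoiceAction("show_results", " ".join(words))
```

With this sketch, "Play A R Rahman" maps to direct playback of "A R Rahman", while an utterance without a recognised verb falls back to an ordinary search listing.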

We also see this pattern in other apps like Gaana, where a user's voice search results directly in an action.

We broke down the specifics here.

Navigation via Voice

One crucial feature that YouTube also enabled is the ability to navigate parts of the app through voice. You can tap the mic and say "Show me my history" and YouTube will take you there. This functionality helps users get through the app's treacherous hierarchies with a single voice command, essentially rendering the entire app flat. Currently, not all menus and sub-menus are accessible by voice commands. There are various parts of the app that users still have to access by touch.
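One way to picture "rendering the app flat" is a lookup from spoken phrases to screens, regardless of how deep those screens sit in the menu hierarchy. The following is a minimal sketch under stated assumptions: the screen identifiers, the prefix list and the `resolve_navigation` function are all hypothetical, and only the "Show me my history" phrase comes from the article.

```python
from typing import Optional

# Hypothetical screen identifiers; only "history" is confirmed by the article.
NAV_DESTINATIONS = {
    "history": "screen/history",
    "trending videos": "screen/trending",
}

# Hypothetical carrier phrases, longest variants first so "my" is consumed.
NAV_PREFIXES = ("show me my ", "show me ", "open my ", "open ")


def resolve_navigation(utterance: str) -> Optional[str]:
    """Return a screen identifier, or None if voice cannot reach the target."""
    text = utterance.lower().strip().rstrip(".!?")
    for prefix in NAV_PREFIXES:
        if text.startswith(prefix):
            target = text[len(prefix):].strip()
            # Unknown targets return None: these are the parts of the
            # app that still need touch (or a fallback to search).
            return NAV_DESTINATIONS.get(target)
    return None
```

A `None` result is the interesting case: it marks the boundary of what voice can reach, which is the point where an app has to either fall back to search or tell the user that the command is not supported.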

What’s still missing?

Multilingual support

The ability to search and navigate via voice in Indian languages like Hindi and Tamil is still missing. With a 400% YoY increase in Hindi searches, it is necessary to include at least bilingual search functionality.

Navigational support

Currently, voice navigation is hit or miss because users are not aware of the voice search's boundaries. YouTube needs to either expand voice navigation support to all parts of the app or inform users about the limitations.

Better NLP capabilities

YouTube can improve the NLP capabilities of the voice search to provide a better experience. Effort needs to go into letting users speak free-form sentences, so that they can ask for videos as naturally as possible.
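As a rough illustration of what handling free-form sentences involves, the sketch below strips a few conversational lead-ins so that "I want to watch Naagin" becomes the query "Naagin". It is a deliberately naive, rule-based example, not a description of how YouTube or any production NLP system works; the phrase list and the `extract_search_query` function are assumptions made for illustration.

```python
import re

# Hypothetical lead-in phrases; a real system would use a trained model
# rather than a hand-written pattern like this one.
CARRIER_PATTERN = re.compile(
    r"^(?:please\s+)?"
    r"(?:i\s+(?:want|would like)\s+to\s+(?:watch|see|hear)"
    r"|can you\s+(?:play|show)(?:\s+me)?)\s+",
    re.IGNORECASE,
)


def extract_search_query(utterance: str) -> str:
    """Remove a conversational lead-in and keep what the user asked for."""
    cleaned = utterance.strip()
    stripped = CARRIER_PATTERN.sub("", cleaned, count=1)
    # If stripping left nothing, fall back to the original text.
    return stripped or cleaned
```

Passing the verbatim transcript to search, as the old voice search did, is what led to "no results found"; even a simple normalisation step like this one changes the query that search receives.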

This major overhaul of the Voice UI was a long time coming. Google is witnessing a 270% YoY increase in voice search across all its properties in India. We expect a lot more functionality to be added to voice search, and we expect this change to be replicated across many different apps, not just by Google but by other brands as well.

Slang allows you to add voice to your apps in the easiest and fastest way possible. Reach out to us at 42@slanglabs.in if you are interested in adding voice to your apps as well.