Assistant Builder vs. Building from Scratch
This is Part 3 in a series about the Slang Assistant Builder.
Follow these links to read the previous parts or to skip ahead:
Part 1: Natural Language Processing for Voice Assistants 101
Part 2: Slang Assistant Builder helps you forget about intents and entities
Part 4: How does the Assistant Builder Work?
As we have seen earlier in this blog series, Slang’s Assistant Builder lets apps benefit from concepts and information shared across apps in the same or similar domains, while still allowing each app to add its own data to improve accuracy.
In this post we go into the details of the benefits of such a system.
Using an off-the-shelf, packaged, domain-specific assistant may be easy to implement. However, it is difficult to customize such an assistant to achieve better accuracy.
We start with some reasons why we at Slang decided to build a knowledge base and the ways in which our customers can benefit from it.
Some customers may not have data ready to share, for one of two reasons. They may be a newer app without a large body of data, or, for sensitivity reasons, they may not be ready to hand over their data to third-party vendors.
If the first two points don’t apply and the customer is willing to upload their data to Slang’s Assistant Builder, Slang’s knowledge base can still act as an additional source of information, covering items, words or synonyms that are missing from the customer’s data.
For example, consider a customer in the fruits and vegetables space. Depending on the geographic location or market the customer is in, many fruits and vegetables may simply not be available there, so the customer will not have the names of those items in their database if they never sell them. However, when using voice, accuracy improves if the NLU system knows about a larger set of items from that domain. One reason is that knowledge of the larger list helps the assistant become more confident that the user was talking about an item from that space. It can then have more informed conversations with users: for example, it can suggest alternatives or confidently guide the user to explore other varieties.
The customer may not have synonyms or similar words. Users are often used to expressing items in particular ways. For example, in India, a word from a local language is often used in place of the English word in an otherwise normal English sentence. Chilli in Hindi is mirchi, so a user may say, ‘I am looking for mirchi.’ With the knowledge base, such connections come out of the box even if the customer only has a listing for chillies. Some apps also use different nomenclature for categories: what is the category electronics on one app may be appliances on another. This kind of fuzzy linkage between possible meanings helps users find what they are looking for, even if the words they use are not strictly correct according to the nomenclature of that particular app.
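The synonym lookup described above can be pictured with a minimal Python sketch. This is only an illustration under stated assumptions, not how Slang's knowledge base is actually implemented: the `DOMAIN_SYNONYMS` table, the `resolve_item` function and the sample catalog are all hypothetical names invented for this example.

```python
# Hypothetical domain knowledge base: maps alternate words to a canonical item.
# The entries are illustrative only, not Slang's actual data.
DOMAIN_SYNONYMS = {
    "mirchi": "chilli",
    "chillis": "chilli",
    "chillies": "chilli",
}


def resolve_item(spoken_word, customer_catalog, synonyms=DOMAIN_SYNONYMS):
    """Return the catalog item a spoken word refers to, or None if unknown."""
    word = spoken_word.strip().lower()
    # Direct hit: the customer already lists this word.
    if word in customer_catalog:
        return word
    # Otherwise fall back to the shared domain synonyms.
    canonical = synonyms.get(word)
    if canonical in customer_catalog:
        return canonical
    return None


# The customer only lists "chilli", yet "mirchi" still resolves to it.
catalog = {"chilli", "tomato"}
print(resolve_item("Mirchi", catalog))
```

The point of the sketch is that the customer's catalog stays small while the shared synonym table does the linking; a real system would likely use fuzzier matching than an exact dictionary lookup.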
Out-of-the-box translation systems such as Google’s translation service are pretty good, but in our experience they struggle with some domain-specific words. We have worked on collecting extensive translations for words specific to a domain, and with constant feedback from usage across apps, this library only becomes more robust.
Reasons why the customer would want to supply their data:
One question that may arise from the above discussion about using Slang’s knowledge base would be, why does Slang even need data from the customer, if it is able to achieve high coverage of all the domain-specific words?
The answer is that, while we aim to have a system that works out of the box without requiring any data from the customer, augmenting our model with the customer’s data further improves the effectiveness of the system.
In some cases we may have missed certain words that are special to the customer’s inventory or geography that were not in our database.
In many cases we have seen there is no standard way of grouping categories and filters. Knowing how the customer’s data is structured helps us return the data in a form that is easy for the customer to consume because it best mirrors their existing structures.
In some cases, hopefully few but important, a word may mean one thing across the entire domain but something special for a particular customer. To take a contrived example, the word apple may mean the fruit in the grocery domain and the brand Apple in the electronics domain. But what if the customer ran a grocery business whose brand was Apple, and wanted the word to mean the store and not the fruit?
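One way to picture this kind of customer-specific override is the Python sketch below. It is a hedged illustration only: `DOMAIN_MEANINGS`, `interpret`, the `customer_overrides` parameter and the meaning labels are hypothetical and do not describe Slang's actual API or data model.

```python
# Hypothetical domain-wide defaults for what a word means (illustrative only).
DOMAIN_MEANINGS = {
    "grocery": {"apple": "item:fruit/apple"},
    "electronics": {"apple": "brand:Apple"},
}


def interpret(word, domain, customer_overrides=None):
    """Resolve a word, letting customer-supplied meanings win over domain defaults."""
    word = word.strip().lower()
    overrides = customer_overrides or {}
    # Customer data takes priority when it defines the word.
    if word in overrides:
        return overrides[word]
    # Otherwise use the shared, domain-level meaning (None if unknown).
    return DOMAIN_MEANINGS.get(domain, {}).get(word)


# Default grocery behaviour: "apple" is the fruit.
print(interpret("apple", "grocery"))
# A grocery customer whose store brand is Apple overrides the default.
print(interpret("apple", "grocery", {"apple": "store:Apple"}))
```

In this sketch the customer's data is consulted first and the shared domain meaning is only a fallback, which is one simple way to let a single customer redefine a word without changing what it means for everyone else in the domain.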
Read the next part in the series here.
