Entities
Rasa can be trained to detect the intent of an utterance, but it can also detect entities within an utterance. An entity can be any important detail that your assistant could use later in a conversation. This includes:
- Numbers
- Dates
- Country names
- Product names
For example, when you have an utterance like:
I would like to book a flight to Sydney
Then we'd like to detect "Sydney" as an entity of type "destination" with a value of "Sydney".
Training Data
You can give examples of entities in your nlu.yml file. Here's an example.
nlu:
- intent: inform
  examples: |
    - My account number is [1234567890](account_number)
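To see what this annotation encodes, here's a minimal Python sketch that turns an annotated example into plain text plus entity spans. It only handles the simple `[text](entity)` form and is not Rasa's own parser; the `parse_annotated_example` helper is a name made up for this illustration.

```python
import re

# Matches the simple inline form: [entity text](entity_name)
ANNOTATION = re.compile(r"\[(?P<text>[^\]]+)\]\((?P<entity>[^)]+)\)")


def parse_annotated_example(example):
    """Return the plain text and the entity spans of an annotated example."""
    plain_parts = []
    entities = []
    cursor = 0  # position in the annotated string
    offset = 0  # length of the plain text built so far
    for match in ANNOTATION.finditer(example):
        before = example[cursor:match.start()]
        plain_parts.append(before)
        offset += len(before)
        value = match.group("text")
        entities.append({
            "entity": match.group("entity"),
            "value": value,
            "start": offset,
            "end": offset + len(value),
        })
        plain_parts.append(value)
        offset += len(value)
        cursor = match.end()
    plain_parts.append(example[cursor:])
    return "".join(plain_parts), entities


text, entities = parse_annotated_example(
    "My account number is [1234567890](account_number)"
)
print(text)
print(entities)
```

The `start` and `end` values are character offsets into the plain text, which is the same convention used in the parsed output shown later on this page.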
Detecting Entities
Pre-Built Models
Rasa can extract entities using pre-built models from third parties, such as Duckling or spaCy. Duckling is generally quite good at extracting numbers, dates, URLs and email addresses. spaCy, on the other hand, provides machine learning models that are pretrained to detect names, locations and organisations. You're free to mix and match, though: nothing is stopping you from using both Duckling and spaCy in a single pipeline.
That means that an utterance like:
I am looking for a flight to Canada that is under $500
Can be parsed by spaCy to detect that "Canada" is a location while Duckling detects that "500" is a number.
In order to detect entities with the provided models, you will need to make sure that your machine learning pipeline is correctly configured as well. But we'll discuss that in a later segment.
Regex
You can also use a regex to detect an entity. This is a great option when you're dealing with phone numbers, postcodes or account numbers.
Here's an example of a regex pattern that's added to the training data.
nlu:
- intent: inform
  examples: |
    - My account number is [1234567890](account_number)
- regex: account_number
  examples: |
    - \d{10,12}
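If you want to try such a pattern before adding it to your training data, a quick check with Python's `re` module can help. This is only a sketch under the assumption that the pattern follows Python regex semantics; the `find_account_numbers` helper is hypothetical. The quantifier is written without a space, `{10,12}`, because Python treats `{10, 12}` as literal text rather than as a repetition count.

```python
import re

# An account number of 10 to 12 digits.
ACCOUNT_NUMBER = re.compile(r"\d{10,12}")


def find_account_numbers(text):
    """Return (value, start, end) for every run of 10 to 12 digits."""
    return [(m.group(), m.start(), m.end()) for m in ACCOUNT_NUMBER.finditer(text)]


found = find_account_numbers("My account number is 1234567890")
print(found)

# With a space inside the braces the pattern is no longer a quantifier,
# so a plain 10-digit string does not match at all.
spaced = re.fullmatch(r"\d{10, 12}", "1234567890")
print(spaced)
```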
You can also use lookup tables to generate case-sensitive regular expression patterns.
nlu:
- lookup: country
  examples: |
    - Afghanistan
    - Albania
    - ...
    - Zambia
    - Zimbabwe
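Conceptually, a lookup table is just a list of words that gets turned into one big regular expression. The Python sketch below shows the general idea. It is not Rasa's internal code, and the `lookup_to_pattern` helper and its `case_sensitive` flag are assumptions made for this illustration.

```python
import re


def lookup_to_pattern(entries, case_sensitive=True):
    """Build a single word-bounded alternation from a list of lookup entries."""
    # Longest entries first, so longer names win over shorter ones they contain.
    ordered = sorted(entries, key=len, reverse=True)
    alternation = "|".join(re.escape(entry) for entry in ordered)
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(rf"\b(?:{alternation})\b", flags)


countries = ["Afghanistan", "Albania", "Zambia", "Zimbabwe"]
pattern = lookup_to_pattern(countries)
matches = [m.group() for m in pattern.finditer("I flew from Albania to Zimbabwe")]
print(matches)
```

Whether matching should ignore case is a design choice: a case-sensitive pattern misses "albania", while a case-insensitive one may pick up ordinary words that happen to share a spelling with an entry.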
In order to detect entities with the provided regex patterns, you will need to make sure that your machine learning pipeline is correctly configured as well. But we'll discuss that in a later segment.
Machine Learning
Another option is to train your own machine learning model to detect entities. The base Rasa model, called DIET, is able to handle this for you. The downside of this approach is that you will need to add sufficient training examples. The upside is that you're not limited to what the premade libraries can offer you.
nlu:
- intent: check_balance
  examples: |
    - I want to check my [savings account](account_type)
    - Can you show me my [current account](account_type)
The DIET model can train on this data, and provide output in the following format:
{
  "entities": [{
    "value": "savings account",
    "start": 20,
    "end": 33,
    "confidence": 0.812631,
    "entity": "account_type",
    "extractor": "DIETClassifier"
  }]
}
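Output in this format is easy to consume from your own code. Here's a small Python sketch that reads it and collects entity values by type. The `confident_entities` helper and the 0.5 threshold are assumptions made for this example, not part of Rasa.

```python
import json

# Example output, copied from the format shown above.
raw_output = """
{
  "entities": [{
    "value": "savings account",
    "start": 20,
    "end": 33,
    "confidence": 0.812631,
    "entity": "account_type",
    "extractor": "DIETClassifier"
  }]
}
"""


def confident_entities(parsed, threshold=0.5):
    """Map entity type -> list of values, skipping low-confidence predictions."""
    result = {}
    for entity in parsed.get("entities", []):
        # Assumption: an entry without a confidence field is treated as certain.
        if entity.get("confidence", 1.0) < threshold:
            continue
        result.setdefault(entity["entity"], []).append(entity["value"])
    return result


parsed = json.loads(raw_output)
print(confident_entities(parsed))
```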
Synonyms
Entities are usually not just detected, they are also used by other components in Rasa. With that in mind, it makes sense to standardise your entities by using synonyms. You can define a synonym list in your nlu.yml file like so:
nlu:
- intent: check_balance
  examples: |
    - I want to check my [savings account](account_type)
    - Can you show me my [credit account](account_type)
- synonym: credit
  examples: |
    - credit card account
    - credit account
If DIET were now to see this utterance:
Can you show me my credit account?
It would output the following:
{
  "entities": [{
    "value": "credit",
    "start": 20,
    "end": 33,
    "confidence": 0.812631,
    "entity": "account_type",
    "extractor": "DIETClassifier"
  }]
}
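To make the effect of a synonym concrete, here's a minimal Python sketch of the mapping step: the extracted text is looked up in a table and replaced by its standard value. This is an illustration only, not Rasa's implementation; the `apply_synonyms` helper and the lowercase lookup are assumptions.

```python
# Surface forms mapped to the value we want downstream components to see.
SYNONYMS = {
    "credit card account": "credit",
    "credit account": "credit",
}


def apply_synonyms(entities, synonyms):
    """Return copies of the entities with known surface forms replaced."""
    normalised = []
    for entity in entities:
        value = entity["value"]
        # Unknown values pass through unchanged.
        replacement = synonyms.get(str(value).lower(), value)
        normalised.append({**entity, "value": replacement})
    return normalised


extracted = [
    {"entity": "account_type", "value": "credit account"},
    {"entity": "account_type", "value": "savings account"},
]
mapped = apply_synonyms(extracted, SYNONYMS)
print(mapped)
```

Note that only the value changes. The entity type stays the same, and values without a synonym are left as they were detected.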
Note that you can also supply synonyms inline, via:
nlu:
- intent: check_balance
  examples: |
    - I want to check my [credit card account]{"entity": "account_type", "value": "credit"}
    - Can you show me my [credit account]{"entity": "account_type", "value": "credit"}
Roles and Groups
Let's consider this utterance:
I am looking for a flight from New York to Boston
We'd be interested in detecting two cities, "New York" and "Boston", but we'd also like to detect additional information: "New York" is the origin and "Boston" is the destination.
The inline syntax for entities can also be used to define roles and groups for the entities, which DIET can then try to predict. Here's what that might look like in the nlu.yml file:
nlu:
- intent: book_a_flight
  examples: |
    - I want a flight from [Berlin]{"entity": "location", "role": "origin"} to [SF]{"value": "San Francisco", "entity": "location", "role": "destination"}
    - ...
This way, our detected data might look like:
{
  "text": "Book a flight from Berlin to SF",
  "intent": "book_a_flight",
  "entities": [
    {
      "start": 19,
      "end": 25,
      "value": "Berlin",
      "entity": "location",
      "role": "origin",
      "extractor": "DIETClassifier"
    },
    {
      "start": 29,
      "end": 31,
      "value": "San Francisco",
      "entity": "location",
      "role": "destination",
      "extractor": "DIETClassifier"
    }
  ]
}
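Roles make it straightforward to tell entities of the same type apart in your own code. The Python sketch below picks out the value for each role, using the entity and role names from the training example ("location", "origin", "destination"). The `values_by_role` helper is hypothetical.

```python
def values_by_role(entities, entity_type):
    """Map each role to its value for entities of the given type."""
    return {
        entity["role"]: entity["value"]
        for entity in entities
        if entity.get("entity") == entity_type and "role" in entity
    }


entities = [
    {"start": 19, "end": 25, "value": "Berlin",
     "entity": "location", "role": "origin"},
    {"start": 29, "end": 31, "value": "San Francisco",
     "entity": "location", "role": "destination"},
]
trip = values_by_role(entities, "location")
print(trip)
```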
Similarly, you can also add "group" information to an entity. In the example below, the group is used to indicate that we're talking about pizza #1.
nlu:
- intent: order_pizza
  examples: |
    - I want to buy a large pizza with [cheese]{"entity": "toppings", "group": "1"} and [mushrooms]{"entity": "toppings", "group": "1"}
    - ...
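Once entities carry a group, you can bundle them per pizza. Here's a minimal Python sketch of that step. The `toppings_per_pizza` helper and the second pizza in the sample data are made up for this illustration.

```python
from collections import defaultdict


def toppings_per_pizza(entities):
    """Collect topping values under the group they were tagged with."""
    pizzas = defaultdict(list)
    for entity in entities:
        if entity.get("entity") == "toppings":
            # Entities without a group end up under the key None.
            pizzas[entity.get("group")].append(entity["value"])
    return dict(pizzas)


entities = [
    {"entity": "toppings", "value": "cheese", "group": "1"},
    {"entity": "toppings", "value": "mushrooms", "group": "1"},
    # Hypothetical second pizza, not part of the training example above.
    {"entity": "toppings", "value": "pepperoni", "group": "2"},
]
order = toppings_per_pizza(entities)
print(order)
```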
Influencing Stories
Finally, it's good to know that entities can influence conversations. Just like an intent is an element of a story pattern, an entity might also be able to steer the conversation. In the example below, we're adding an extra action if the destination city is London.
stories:
- story: The user just arrived from another city.
  steps:
  - intent: greet
  - action: utter_greet
  - intent: inform_location
    entities:
    - city: London
      role: destination
  - action: utter_ask_about_london_trip
Exercises
Try to answer the following questions to test your knowledge.
- Can a single word in a sentence be part of two entities?
- What are the three main ways to detect entities in Rasa?