Entities


Rasa can be trained to detect the intent of an utterance, but it can also detect entities within an utterance. An entity can be any important detail that your assistant could use later in a conversation. This includes:

  • Numbers
  • Dates
  • Country names
  • Product names

For example, when you have an utterance like:

I would like to book a flight to Sydney

Then we'd like to detect "Sydney" as an entity of type "destination" with a value of "Sydney".

Training Data

You can give examples of entities in your nlu.yml file. Here's an example.

nlu:
- intent: inform
  examples: |
    - My account number is [1234567890](account_number)
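To make the annotation syntax concrete, here is a minimal Python sketch that turns an annotated example into plain text plus a list of entity spans. It is not Rasa's own parser; the `parse_example` helper and the `ANNOTATION` pattern are names made up for this illustration, and the sketch only handles the simple `[text](entity)` form.

```python
import re

# Matches the simple annotation form: [entity text](entity_type)
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_example(example):
    """Return the plain text and the entity spans of one annotated example.

    Hypothetical helper for illustration only, not part of Rasa.
    """
    plain_parts = []
    entities = []
    cursor = 0  # position in the annotated string
    length = 0  # length of the plain text built so far
    for match in ANNOTATION.finditer(example):
        # Copy the text that precedes the annotation unchanged
        before = example[cursor:match.start()]
        plain_parts.append(before)
        length += len(before)
        value, entity = match.group(1), match.group(2)
        entities.append({
            "entity": entity,
            "value": value,
            "start": length,               # zero-based offset into the plain text
            "end": length + len(value),    # exclusive end offset
        })
        plain_parts.append(value)
        length += len(value)
        cursor = match.end()
    plain_parts.append(example[cursor:])
    return "".join(plain_parts), entities


text, entities = parse_example("My account number is [1234567890](account_number)")
print(text)
print(entities)
```

The brackets and the entity name are stripped from the text, so the model trains on the sentence as a user would actually type it.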

Detecting Entities

Pre-Built Models

Rasa can also extract entities with pre-built models from third-party libraries, such as Duckling or spaCy. Duckling is generally quite good at extracting numbers, dates, URLs and email addresses. spaCy, on the other hand, provides machine learning models that are pretrained to detect names, locations and organisations. You're free to mix and match, though: nothing stops you from using both Duckling and spaCy in a single pipeline.

That means that an utterance like:

I am looking for a flight to Canada that is under $500

Can be parsed by spaCy to detect that "Canada" is a location while Duckling detects that "500" is a number.

In order to detect entities with the provided models, you will need to make sure that your machine learning pipeline is correctly configured as well. But we'll discuss that in a later segment.

Regex

You can also use a regex to detect an entity. This is a great option when you're dealing with phone numbers, postcodes or account numbers.

Here's an example of a regex pattern that's added to the training data.

nlu:
- intent: inform
  examples: |
    - My account number is [1234567890](account_number)
- regex: account_number
  examples: |
    - \d{10,12}
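It can help to try a pattern in plain Python before adding it to the training data. The sketch below uses only the standard `re` module, under the assumption that the pattern will be evaluated with Python-style regex rules; `find_account_numbers` is a made-up helper name. It also shows why the quantifier should be written without a space: in Python, `{10, 12}` is treated as literal text rather than as "10 to 12 repetitions".

```python
import re

# An account number of 10 to 12 digits; note there is no space inside the braces
ACCOUNT_NUMBER = re.compile(r"\d{10,12}")


def find_account_numbers(text):
    """Return (start, end, value) for every run of 10 to 12 digits."""
    return [(m.start(), m.end(), m.group()) for m in ACCOUNT_NUMBER.finditer(text)]


utterance = "My account number is 1234567890"
matches = find_account_numbers(utterance)
print(matches)

# With a space after the comma the braces are no longer a quantifier,
# so this pattern does not find the account number.
with_space = re.compile(r"\d{10, 12}")
print(with_space.search(utterance))
```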

You can also use lookup tables to generate case-insensitive regular expression patterns.

nlu:
- lookup: country
  examples: |
    - Afghanistan
    - Albania
    - ...
    - Zambia
    - Zimbabwe
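One way to picture a lookup table is as a single regular expression with one alternative per entry. The sketch below is an illustration in plain Python, not Rasa's implementation; `lookup_to_pattern` is a hypothetical helper. It passes `re.IGNORECASE` so that "zambia" and "Zambia" both match; remove the flag if exact casing should be required.

```python
import re


def lookup_to_pattern(entries):
    """Build one regex that matches any entry of a lookup table.

    Hypothetical helper for illustration only.
    """
    # Try longer entries first so they win over shorter entries they contain
    ordered = sorted(entries, key=len, reverse=True)
    alternatives = "|".join(re.escape(entry) for entry in ordered)
    # \b keeps the pattern from matching inside a longer word
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


countries = ["Afghanistan", "Albania", "Zambia", "Zimbabwe"]
COUNTRY = lookup_to_pattern(countries)

match = COUNTRY.search("is there a flight to zambia tomorrow?")
print(match.group(), match.span())
```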

In order to detect entities with the provided regex patterns, you will need to make sure that your machine learning pipeline is correctly configured as well. But we'll discuss that in a later segment.

Machine Learning

Another option is to train your own machine learning model to detect entities. The base Rasa model, called DIET, is able to handle this for you. The downside of this approach is that you will need to add sufficient training examples but the upside is that you're not limited to what the premade libraries can offer you.

nlu:
- intent: check_balance
  examples: |
    - I want to check my [savings account](account_type)
    - Can you show me my [current account](account_type)

The DIET model can train on this data and provide output in the following format. For the first example above, the result might look like this:

{
  "entities": [{
    "value": "savings account",
    "start": 19,
    "end": 34,
    "confidence": 0.812631,
    "entity": "account_type",
    "extractor": "DIETClassifier"
  }]
}
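The "start" and "end" fields are character offsets into the original utterance. The sketch below assumes that "start" is zero-based and "end" is exclusive, which is the same convention Python slices use; `make_entity` is a made-up helper that computes the offsets instead of hard-coding them, so you can compare the result with what your own assistant returns.

```python
def make_entity(text, value, entity, extractor):
    """Build an entity dict with offsets computed from the text.

    Hypothetical helper; assumes zero-based start and exclusive end.
    """
    start = text.index(value)  # raises ValueError if the value is not in the text
    return {
        "value": value,
        "start": start,
        "end": start + len(value),
        "entity": entity,
        "extractor": extractor,
    }


text = "I want to check my savings account"
entity = make_entity(text, "savings account", "account_type", "DIETClassifier")

print(entity["start"], entity["end"])
# Slicing the utterance with the offsets gives back the entity text
print(text[entity["start"]:entity["end"]])
```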

Synonyms

Entities are usually not just detected, they are also used by other components in Rasa. With that in mind, it makes sense to standardise your entities by using synonyms. You can define a synonym list in your nlu.yml file like so:

nlu:
- intent: check_balance
  examples: |
    - I want to check my [savings account](account_type)
    - Can you show me my [credit account](account_type)
- synonym: credit
  examples: |
    - credit card account
    - credit account

If DIET were now to see this utterance:

Can you show me my credit account?

It would output the following:

{
  "entities": [{
    "value": "credit",
    "start": 19,
    "end": 33,
    "confidence": 0.812631,
    "entity": "account_type",
    "extractor": "DIETClassifier"
  }]
}
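Conceptually, a synonym is a lookup that runs after extraction: the extracted text is replaced by its standard value while the offsets stay the same. Here is a minimal sketch of that idea in plain Python. The names `SYNONYMS`, `build_lookup` and `normalise` are invented for the example, and the lower-casing of the extracted text is an assumption of this sketch.

```python
# Standard value -> the phrasings that should be mapped to it
SYNONYMS = {"credit": ["credit card account", "credit account"]}


def build_lookup(synonyms):
    """Invert the synonym table into {phrasing: standard value}."""
    return {
        variant.lower(): canonical
        for canonical, variants in synonyms.items()
        for variant in variants
    }


def normalise(entities, lookup):
    """Return copies of the entities with synonym values replaced."""
    result = []
    for entity in entities:
        mapped = dict(entity)  # leave the input untouched
        mapped["value"] = lookup.get(entity["value"].lower(), entity["value"])
        result.append(mapped)
    return result


lookup = build_lookup(SYNONYMS)
extracted = [
    {"entity": "account_type", "value": "credit account", "start": 19, "end": 33},
    {"entity": "account_type", "value": "savings account", "start": 19, "end": 34},
]
normalised = normalise(extracted, lookup)
print([entity["value"] for entity in normalised])
```

Values without a synonym entry, such as "savings account" here, pass through unchanged.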

Note that you can also supply synonyms inline, via:

nlu:
- intent: check_balance
  examples: |
    - I want to check my [credit card account]{"entity": "account_type", "value": "credit"}
    - Can you show me my [credit account]{"entity": "account_type", "value": "credit"}
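The part in curly braces is a small JSON object, so it can be read with a JSON parser. The sketch below is an illustration only, not Rasa's own parsing code; `parse_inline` and `INLINE` are hypothetical names, and the pattern assumes the JSON object contains no nested braces. It shows how the "value" field overrides the text that appears in the sentence.

```python
import json
import re

# [surface text]{...json...}, assuming no nested braces inside the JSON object
INLINE = re.compile(r"\[([^\]]+)\](\{[^}]*\})")


def parse_inline(example):
    """Extract entity annotations written in the extended inline syntax."""
    entities = []
    for match in INLINE.finditer(example):
        surface = match.group(1)
        attributes = json.loads(match.group(2))
        entities.append({
            "entity": attributes["entity"],
            "text": surface,
            # Fall back to the surface text when no "value" is given
            "value": attributes.get("value", surface),
        })
    return entities


example = 'Can you show me my [credit account]{"entity": "account_type", "value": "credit"}'
parsed = parse_inline(example)
print(parsed)
```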

Roles and Groups

Let's consider this utterance:

I am looking for a flight from New York to Boston

We'd be interested in detecting the two cities, "New York" and "Boston", but we'd also like to know which role each one plays: "New York" is the origin and "Boston" is the destination.

The inline syntax for entities can also be used to define roles and groups for the entities, which DIET can then try to predict. Here's what that might look like in the nlu.yml file:

nlu:
- intent: book_a_flight
  examples: |
    - I want a flight from [Berlin]{"entity": "location", "role": "origin"} to [SF]{"value": "San Francisco", "entity": "location", "role": "destination"}
    - ...

This way, our detected data might look like:

{
  "text": "Book a flight from Berlin to SF",
  "intent": "book_a_flight",
  "entities": [
    {
      "start": 19,
      "end": 25,
      "value": "Berlin",
      "entity": "location",
      "role": "origin",
      "extractor": "DIETClassifier"
    },
    {
      "start": 29,
      "end": 31,
      "value": "San Francisco",
      "entity": "location",
      "role": "destination",
      "extractor": "DIETClassifier"
    }
  ]
}
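Roles pay off when you read the result: two entities of the same type can be told apart without guessing from their order. A small sketch, assuming a parse result shaped like the training data above with entity type "location" and roles "origin" and "destination"; `entities_by_role` is a made-up helper name.

```python
def entities_by_role(parse_result, entity_type):
    """Map each role to the entity value, for one entity type."""
    return {
        entity["role"]: entity["value"]
        for entity in parse_result["entities"]
        if entity["entity"] == entity_type and "role" in entity
    }


# Hand-written parse result used as sample input for the sketch
parse_result = {
    "text": "Book a flight from Berlin to SF",
    "entities": [
        {"start": 19, "end": 25, "value": "Berlin",
         "entity": "location", "role": "origin"},
        {"start": 29, "end": 31, "value": "San Francisco",
         "entity": "location", "role": "destination"},
    ],
}

trip = entities_by_role(parse_result, "location")
print(f"{trip['origin']} -> {trip['destination']}")
```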

Similarly, you can also add "group" information to an entity. In the example below, the group is used to indicate that we're talking about pizza #1.

nlu:
- intent: order_pizza
  examples: |
    - I want to buy a large pizza with [cheese]{"entity": "toppings", "group": "1"} and [mushrooms]{"entity": "toppings", "group": "1"}
    - ...
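Once the group labels are predicted, they let you reassemble which toppings belong to which pizza. The sketch below shows that step in plain Python. The `group_entities` helper and the second pizza with "pepperoni" are invented for the illustration and do not come from the training data above.

```python
from collections import defaultdict


def group_entities(entities):
    """Collect entity values per group label."""
    grouped = defaultdict(list)
    for entity in entities:
        # Entities without a group end up under a placeholder key
        grouped[entity.get("group", "ungrouped")].append(entity["value"])
    return dict(grouped)


# Hand-written sample input; group "2" is a hypothetical second pizza
entities = [
    {"entity": "toppings", "value": "cheese", "group": "1"},
    {"entity": "toppings", "value": "mushrooms", "group": "1"},
    {"entity": "toppings", "value": "pepperoni", "group": "2"},
]

pizzas = group_entities(entities)
print(pizzas)
```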

Influencing Stories

Finally, it's good to know that entities can influence conversations. Just like an intent is an element of a story pattern, an entity might also be able to steer the conversation. In the example below, we're adding an extra action if the destination city is London.

stories:
- story: The user just arrived from another city.
  steps:
  - intent: greet
  - action: utter_greet
  - intent: inform_location
    entities:
    - city: London
      role: destination
  - action: utter_ask_about_london_trip

Exercises

Try to answer the following questions to test your knowledge.

  • Can a single word in a sentence be part of two entities?
  • What are the three main ways to detect entities in Rasa?

2016-2024 © Rasa.