NLU
How it works
The Botpress NLU module processes every incoming message and performs Intent Classification, Language Identification, Entity Extraction and Slot Tagging. The structured data these tasks produce is added directly to the message metadata (under event.nlu), ready to be consumed by other modules and components.
QnA: A simple use-case for bots is to understand a question and to provide an answer automatically. Doing that manually for all the questions and answers using the NLU module and the flow editor would be a tedious task, which is why we recommend using the QnA module for that instead.
Intent Classification
Intent classification helps you detect the intent of the users. It is a better and more accurate way to understand what the user is trying to say than using keywords.
Examples
User said | Intent | Confidence |
---|---|---|
"I want to fly to Dubai tomorrow" | search_flight | 0.98 |
"My flight is delayed, help!" | faq_flight_delayed | 0.82 |
"Can I bring a pet aboard?" | faq_pet | 0.85 |
Adding an intent
To create a new intent, navigate to the NLU module then click "Create new intent". Give it a friendly name, then hit OK. You should now add "utterances" of that intent – that is, add as many ways of expressing that intent as possible.
Flight Booking Example
- book flight
- i want to book a flight
- i want to fly to new york tomorrow
- show me travel options from montreal to tokyo
# provide as many as you can
Responding to an intent
You may detect and reply to intents by looking up the event.nlu.intent.name variable in your hooks, flow transitions or actions.
Here's an example of the structure of an incoming event processed by Botpress Native NLU.
{
"type": "text",
"channel": "web",
"direction": "incoming",
"payload": {
"type": "text",
"text": "hey"
},
"target": "AwIiKCRH4gH2GBJgQZd7q",
"botId": "my-new-bot",
"threadId": "5",
"id": 1.5420658919105e+17,
"preview": "hey",
"flags": {},
"nlu": { // <<<<------
"language": "en", // language identified
"intent": { // most likely intent, assuming confidence is within config threshold
"name": "hello",
"confidence": 1
},
"intents": [ // all the intents detected, sorted by probabilities
{
"name": "hello",
"confidence": 1,
"provider": "native"
},
{
"name": "none",
"confidence": 1.94931e-8,
"provider": "native"
}
],
"entities": [], // extracted entities
"slots" : {} // extracted slots
}
}
You can use that metadata in your flows to create transitions when a specific intent is understood inside a specific flow. You can learn more about flows and transitions here.
Example
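A minimal sketch of a check that could back such a transition or an action. The isIntent helper and its default threshold are hypothetical and not part of the Botpress API; only the event.nlu.intent shape comes from the event shown above.

```javascript
// Hypothetical helper: true when the top intent matches `name`
// with at least `minConfidence` (the 0.5 default is an assumption).
function isIntent(event, name, minConfidence = 0.5) {
  const intent = event && event.nlu && event.nlu.intent
  if (!intent) return false
  return intent.name === name && intent.confidence >= minConfidence
}

// Trimmed-down version of the incoming event shown above
const event = {
  preview: 'hey',
  nlu: { language: 'en', intent: { name: 'hello', confidence: 1 } }
}

console.log(isIntent(event, 'hello')) // true
console.log(isIntent(event, 'search_flight')) // false
```

Keeping the threshold as a parameter lets each flow decide how strict it wants to be for a given intent.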
Confidence and debugging
To enable debugging of the NLU module, make sure that debugModeEnabled is set to true in your data/global/config/nlu.json file.
Tip: In production, you can also use the BP_NLU_DEBUGMODEENABLED environment variable instead of modifying the configuration directly.
Example of debugging message
NLU Extraction
{ text: 'they there bud',
intent: 'hello',
confidence: 0.966797,
bot_min_confidence: 0.3,
bot_max_confidence: 100,
is_confident_enough: true,
language: 'en',
entities: []
}
Entity Extraction
Entity Extraction helps you extract and normalize known entities from phrases.
Attached to NLU extraction, you will find an entities property which is an array of System and Custom entities.
Using entities
You may access and use data by looking up the event.nlu.entities variable in your hooks, flow transitions or actions.
Example of extracted entity:
User said : Let's go for a five miles run
{
/* ... other event nlu properties ... */
entities: [
{
type: 'distance',
meta: {
confidence: 1,
provider: 'native',
source: 'five miles', // text from which the entity was extracted
start: 15, // beginning character index in the input
end: 25, // end character index in the input
},
data: {
value : 5,
unit: 'mile',
extras: {}
}
},
{
type: 'numeral',
meta: {
confidence: 1,
provider: 'native',
source: 'five', // text from which the entity was extracted
start: 15, // beginning character index in the input
end: 19, // end character index in the input
},
data: {
value : 5,
extras: {}
}
}
]
}
Note: In some cases, you will find additional structured information in the extras object.
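As an illustration of consuming event.nlu.entities, here is a small sketch that filters entities by type. The entitiesOfType helper is hypothetical; the entity objects mirror the extraction example above.

```javascript
// Hypothetical helper: returns the extracted entities of a given type.
function entitiesOfType(event, type) {
  const entities = (event.nlu && event.nlu.entities) || []
  return entities.filter(entity => entity.type === type)
}

// Entities copied from the "five miles" example above
const event = {
  nlu: {
    entities: [
      {
        type: 'distance',
        meta: { confidence: 1, provider: 'native', source: 'five miles', start: 15, end: 25 },
        data: { value: 5, unit: 'mile', extras: {} }
      },
      {
        type: 'numeral',
        meta: { confidence: 1, provider: 'native', source: 'five', start: 15, end: 19 },
        data: { value: 5, extras: {} }
      }
    ]
  }
}

const [distance] = entitiesOfType(event, 'distance')
console.log(`${distance.data.value} ${distance.data.unit}`) // 5 mile
```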
System Entities
Duckling extraction
Botpress Native NLU extracts a handful of system entities, such as Time, Ordinals and Date, using Facebook/Duckling. For a complete list of system entities, please head to the Duckling documentation.
At the moment, Duckling is hosted on our remote servers. If you don't want your data to be sent to our servers, you can either disable this feature by setting ducklingEnabled to false, or host your own Duckling server and change the ducklingURL in the data/global/config/nlu.json config file.
For instructions on how to host your own Duckling server, please check the Deployment section.
Example
User said | Type | Value | Unit |
---|---|---|---|
"Add 5 lbs of sugar to my cart" | "quantity" | 5 | "pound" |
{
type: 'quantity',
meta: {
confidence: 1,
provider: 'native',
source: '5 lbs', // text from which the entity was extracted
start: 4, // beginning character index in original input
end: 9, // end character index in original input
},
data: {
value : 5,
unit: 'pound',
extras: {}
}
}
Note: Confidence will always be 1 because Duckling's implementation is rule-based.
Placeholder extraction (experimental)
Botpress Native NLU also ships a system entity of type any, which is essentially a placeholder. This feature works but requires a lot of training data. Before tagging slots (see the slots docs) with entity type any, try to use custom entities.
An example of placeholder entity would be : Please tell Sarah that she's late
Custom Entities
As of today, we provide 2 types of custom entities: pattern and list entities. To define a custom entity, head to the Entity section of the Understanding Module in your Botpress Studio sidebar. From there, you'll be able to define custom entities that will be available for any input message processed by your chatbot. Go ahead and click on create new entity.
Sensitive Information
Communication between users and bots is stored in the database, which means that personal information (e.g. a credit card number) may sometimes be persisted as well. To avoid that problem, you can tell Botpress that certain entities must not be persisted. When creating or editing an entity, there is a small checkbox labeled sensitive in the upper right corner.
When it is checked, the information is still displayed in the chat window, but the sensitive information is replaced by ***** before being stored. The original value is still available from event.nlu.entities.
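To make the masking behavior concrete, here is a rough sketch of how a stored message could be redacted from entity positions. This is not Botpress source code: maskSensitive, the credit_card entity name and the sample message are all assumptions made for illustration.

```javascript
// Hypothetical sketch: replace the span of each sensitive entity with '*****'.
function maskSensitive(text, entities, sensitiveNames) {
  // Process spans from last to first so earlier indices stay valid
  const spans = entities
    .filter(entity => sensitiveNames.includes(entity.name))
    .sort((a, b) => b.meta.start - a.meta.start)

  let masked = text
  for (const { meta } of spans) {
    masked = masked.slice(0, meta.start) + '*****' + masked.slice(meta.end)
  }
  return masked
}

// Made-up message and entity, shaped like the extraction examples in this page
const text = 'My card number is 4111111111111111'
const entities = [
  { name: 'credit_card', type: 'pattern', meta: { source: '4111111111111111', start: 18, end: 34 } }
]

console.log(maskSensitive(text, entities, ['credit_card'])) // My card number is *****
```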
Pattern extraction
Once you've created a pattern entity, Botpress Native NLU will perform a regex extraction on each incoming message and add it to event.nlu.entities.
Example:
Given a Pattern Entity definition with [A-Z]{3}-[0-9]{4}-[A-Z]{3} as pattern, extraction will go like:
User said | Type | Value |
---|---|---|
"Find product BHZ-1234-UYT" | "SKU" | "BHZ-1234-UYT" |
{ name: 'SKU',
type: 'pattern',
meta:
{ confidence: 1,
provider: 'native',
source: 'BHZ-1234-UYT',
start: 13,
end: 25,
raw: {} },
data: {
extras: {},
value: 'BHZ-1234-UYT',
unit: 'string'
}
}
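The regex extraction described above can be approximated in a few lines. This is a sketch under assumptions, not the Botpress implementation: extractPattern is a hypothetical function, and it only mirrors the shape of the entity shown above.

```javascript
// Hypothetical sketch of a pattern extractor; not Botpress source code.
function extractPattern(text, name, pattern) {
  const regex = new RegExp(pattern, 'g') // the 'g' flag is required by matchAll
  return Array.from(text.matchAll(regex), match => ({
    name,
    type: 'pattern',
    meta: {
      confidence: 1, // a regex either matches or it doesn't
      source: match[0],
      start: match.index,
      end: match.index + match[0].length,
      raw: {}
    },
    data: { extras: {}, value: match[0], unit: 'string' }
  }))
}

const [sku] = extractPattern('Find product BHZ-1234-UYT', 'SKU', '[A-Z]{3}-[0-9]{4}-[A-Z]{3}')
console.log(sku.data.value, sku.meta.start, sku.meta.end) // BHZ-1234-UYT 13 25
```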
List extraction
List extraction will behave in a similar way. The major addition is that for your entity definition, you'll be able to add different occurrences of your entity with corresponding synonyms.
Let's take Airport Codes as an example:
Extraction will go like:
User said | Type | Value |
---|---|---|
"Find a flight from SFO to Mumbai" | "Airport Codes" | ["SFO", "BOM"] |
[
{
name: 'Airport Codes',
type: 'list',
meta: {
confidence: 1,
provider: 'native',
source: 'SFO',
start: 19,
end: 22,
raw: {}
},
data: {
extras: {},
value: 'SFO',
unit: 'string'
}
},
{
name: 'Airport Codes',
type: 'list',
meta: {
confidence: 1,
provider: 'native',
source: 'Mumbai',
start: 26,
end: 32,
raw: {}
},
data: {
extras: {},
value: 'BOM',
unit: 'string'
}
}
]
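A simplified sketch of how occurrences and synonyms map back to a canonical value. extractList and the airportCodes definition are hypothetical; the real extractor is likely more robust than this plain substring search, which would also match inside longer words.

```javascript
// Hypothetical sketch of list extraction with synonyms; not Botpress source code.
// `occurrences` maps each canonical value to its synonyms.
function extractList(text, name, occurrences) {
  const lowerText = text.toLowerCase()
  const found = []
  for (const [value, synonyms] of Object.entries(occurrences)) {
    for (const candidate of [value, ...synonyms]) {
      const start = lowerText.indexOf(candidate.toLowerCase())
      if (start === -1) continue
      const end = start + candidate.length
      found.push({
        name,
        type: 'list',
        meta: { confidence: 1, source: text.slice(start, end), start, end, raw: {} },
        data: { extras: {}, value, unit: 'string' }
      })
      break // one match per occurrence is enough for this sketch
    }
  }
  // Return entities in the order they appear in the text
  return found.sort((a, b) => a.meta.start - b.meta.start)
}

// Assumed entity definition: canonical code -> synonyms
const airportCodes = { SFO: ['San Francisco'], BOM: ['Mumbai', 'Bombay'] }
const entities = extractList('Find a flight from SFO to Mumbai', 'Airport Codes', airportCodes)
console.log(entities.map(entity => entity.data.value)) // [ 'SFO', 'BOM' ]
```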
Slots
Slots are another major concept in Botpress NLU. You can think of them as the parameters needed to complete the action associated with an intent.
Slot Tagging
Botpress Native NLU will tag each word (token) of the user input. If a token is correctly identified as an intent slot, it will be attached to the NLU extraction event. Each identified slot will be accessible in the event.nlu.slots map using its name as key.
To define a slot for a particular intent, head to the Intent section of the Understanding Module in your Botpress Studio sidebar. From there, select the intent you want to add slots to, then you'll be able to define your slots. Go ahead and click on create a slot.
Let's use a find_flight intent. In order to book a flight, we'll define 2 slots, airport_from and airport_to, both associated with the Airport Codes custom list entity. Once that is done, we need to identify every airport slot.
Example
User said : I would like to go to SFO from Mumbai
event.nlu.slots will look like:
slots : {
airport_to: {
name: 'airport_to',
value: 'SFO', // shorthand for entity.data.value
entity: [Object] //detailed extracted entity
},
airport_from: {
name: 'airport_from',
value: 'BOM', // shorthand for entity.data.value
entity: [Object] //detailed extracted entity
}
}
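Since each slot carries a value shorthand, an action can flatten the map before using it. The slotValues helper below is hypothetical, and the entity details are trimmed from the sample for brevity.

```javascript
// Hypothetical helper: flattens event.nlu.slots into { slotName: value }.
function slotValues(event) {
  const slots = (event.nlu && event.nlu.slots) || {}
  return Object.fromEntries(
    Object.entries(slots).map(([name, slot]) => [name, slot.value])
  )
}

// Slots from the example above (entity details omitted)
const event = {
  nlu: {
    slots: {
      airport_to: { name: 'airport_to', value: 'SFO', entity: {} },
      airport_from: { name: 'airport_from', value: 'BOM', entity: {} }
    }
  }
}

const { airport_from, airport_to } = slotValues(event)
console.log(`Searching flights from ${airport_from} to ${airport_to}`)
// Searching flights from BOM to SFO
```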
Slot Filling
Slot filling is the process of gathering the information required by an intent. This information is defined as slots, as mentioned in the section above. Previously, slot filling was done manually and required a lot of manipulation. Since 11.8, you can use the Slot skill to help with slot filling. Please refer to the Slot Skill tutorial for further details.
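To give an idea of what manual slot filling involves, the sketch below lists the slots that still need to be asked for. missingSlots is a hypothetical helper and is not how the Slot skill is implemented.

```javascript
// Hypothetical helper: names of the required slots not yet present in event.nlu.slots.
function missingSlots(event, requiredSlots) {
  const slots = (event.nlu && event.nlu.slots) || {}
  return requiredSlots.filter(name => !(name in slots))
}

// Assumed event where the user only gave a destination
const event = {
  nlu: { slots: { airport_to: { name: 'airport_to', value: 'SFO', entity: {} } } }
}

console.log(missingSlots(event, ['airport_from', 'airport_to'])) // [ 'airport_from' ]
```

A flow could loop on this list, prompting the user for one missing slot at a time until it is empty.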
Language Server
The language server provides additional information about words, which allows your bot to understand words with a similar meaning even if you didn't specifically teach them to it. By default, your Botpress server will query one of our language servers for that purpose. You can also choose to host your own server if you would like to keep everything on your premises. Head over to the Hosting page for more details.
External NLU Providers
Botpress NLU ships with a native NLU engine (Botpress Native NLU). The advantages of using Botpress NLU are that it is fast (both at training and evaluation time), secure (doesn't hit the cloud), predictable (you can write unit tests, the model resides on your computer) and free.
If for some reason you want to use an external provider, you can do so by using Hooks and calling the external NLU provider via its API. There's a detailed example here.
Note: We have dropped support (see why) for two-way synchronization as there were too many issues in doing (and maintaining) that. You'll have to maintain this yourself if you go this way. We're open to contributions for both implementation and maintenance of 3rd party NLU integrations.
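As a rough idea of what such a hook might do, the sketch below merges a provider's answer into event.nlu. Everything here is an assumption: applyExternalNlu, the callProvider parameter and the shape of the provider's response are made up for illustration, and a real hook would call the provider's actual HTTP API.

```javascript
// Hypothetical sketch: merges an external provider's result into event.nlu.
// `callProvider` stands in for whatever API call your provider requires.
async function applyExternalNlu(event, callProvider) {
  const result = await callProvider(event.preview)
  event.nlu = {
    ...event.nlu,
    intent: { name: result.intent, confidence: result.confidence },
    intents: [{ name: result.intent, confidence: result.confidence, provider: 'external' }]
  }
  return event
}

// Fake provider used for illustration only; it performs no network call
const fakeProvider = async text => ({ intent: 'hello', confidence: 0.9 })

const ready = applyExternalNlu({ preview: 'hey', nlu: { language: 'en' } }, fakeProvider)
ready.then(event => console.log(event.nlu.intent.name)) // hello
```

Passing the provider call in as a function keeps the mapping logic testable without network access.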
Features by Providers
Provider | Intent | Entity | Slot tagging | Lang | Context | Sentiment |
---|---|---|---|---|---|---|
Native | X | X | X | X | X | |
DialogFlow | X | X | X | X | ||
Luis | X | X | X | |||
Recast | X | X | X | X | ||
Rasa | X | X |
If you didn't get any error when starting Botpress for the first time, you can skip this section and move to the Quick Start guide.
Error Training Model
Botpress Native NLU depends on the fastText library to build and run models. On some Linux distributions, you may be required to build it manually. If you get an error like the following, you will need to compile the library yourself.
Mod[nlu][native] Error training model
Prerequisite
If you already have make and g++ installed, you can skip to the next section, Building.
sudo apt update
sudo apt install make
sudo apt install g++
Building
Type these commands to generate the binary for your specific platform:
wget https://github.com/facebookresearch/fastText/archive/v0.1.0.zip
unzip v0.1.0.zip
cd fastText-0.1.0
make
Then edit the NLU config file in data/global/config/nlu.json and add the fastTextPath pointing to the fasttext binary:
// ...
"confidenceTreshold": 0.7,
"fastTextPath": "/home/ubuntu/fastText-0.1.0/fasttext"
// ...