NLP.js NLU Integration
Turn raw text into structured meaning with the Jovo Framework integration for the open source natural language understanding service NLP.js.
Introduction
NLP.js is an open source natural language understanding (NLU) library with features like entity extraction, sentiment analysis, and language detection.
Since it is an open source service, you can host NLP.js on your own servers without any external API calls.
You can use the Jovo NLP.js integration for projects where you receive raw text input that needs to be translated into structured meaning to work with the Jovo intent structure. Learn more in the NLU integration docs.
Smaller NLP.js language models are fast to train and can even be used on serverless infrastructure like AWS Lambda without having to use any additional server infrastructure. We recommend taking a close look at the execution times though, as larger models can take quite some time to build.
Installation
You can install the plugin like this:
$ npm install @jovotech/nlu-nlpjs
NLU plugins can be added to Jovo platform integrations. Here is an example how it can be added to the Jovo Core Platform in your app.ts
app configuration:
import { CorePlatform } from '@jovotech/platform-core'; import { NlpjsNlu } from '@jovotech/nlu-nlpjs'; // ... app.configure({ plugins: [ new CorePlatform({ plugins: [new NlpjsNlu()], }), // ... ], });
Configuration
The following configurations can be added:
new NlpjsNlu({ languageMap: { /* ... */ }, preTrainedModelFilePath: './model.nlp', useModel: false, modelsPath: './models', setupModelCallback: (platform: Platform, nlpjs) => { /* ... */ }, }),
languageMap
: Maps locales to NLP.js language packages. The default is an empty object. See language configuration for more information.useModel
andpreTrainedModelFilePath
: NLP.js can take a pre-trained model ifuseModel
is set totrue
. It looks for the model in thepreTrainedModelFilePath
. The default is./model.nlp
.setupModelCallback
: A function that can be passed to set up NLP.js. The first parameter is the currentPlatform
and the second parameter is theNlp
-instance of NLP.js.
Depending on the configuration, NlpjsNlu will try to use the setupModelCallback
if it exists.
Otherwise, the integration will check if useModel
is set to true
, if that's the case, the model is getting loaded from preTrainedModelFilePath
.
If setupModelCallback
does not exist and useModel
is falsy, the integration will attempt to build a model based on the local models and train it.
Language Configuration
For each language you want to use with NLP.js, you need to download the respective language package. Here is an example for en
(English):
$ npm install @nlpjs/lang-en
You can then add this to your languageMap
:
import { LangEn } from '@nlpjs/lang-en'; // ... new NlpjsNlu({ languageMap: { en: LangEn, // ... }, // ... }),
Each key (in the above case en
) represents a locale that can be found in your Jovo Model.
You can find more information about supported languages and available packages in the official NLP.js docs.
Intents
When user input matches a defined intent, the NLU response includes the name in $input.nlu
:
// $input.nlu { "intent": { "name": "YesIntent" } }
When no intent is matched, $input.nlu
will include the None
intent:
// $input.nlu { "intent": { "name": "None" } }
Entities
You can access NLP.js entities by using the $entities
property. You can learn more in the Jovo Model and the $entities
documentation.
The NLP.js entity values are translated into the following Jovo entity properties:
{ value: utteranceText, // what the user said resolved: option, // the resolved value id: option, // same as resolved, since NLP.js doesn't support IDs native: { /* raw API response for this entity */ } }
Jovo Model
You can use the Jovo Model to turn the language model files in your models
folder into an NLP.js model. Learn more about the NLP.js Jovo Model integration here.
See the sample Jovo model and the corresponding NLP.js model.
For NLP.js, the Jovo Model supports the following Intent items:
- Intent only
- Custom enum entities
- Custom regex entities
- Builtin entities
- Combined builtin and custom entities
To learn more, see NER Manager from the NLP.js docs.
Intent only
An intent-only entry includes an array of phrases
:
// Jovo model { "invocation": "my test app", "version": "4.0", "intents": { "YesIntent": { "phrases": [ "yes", "yes please", "sure" ] } } }
Custom enum entities
An intent can include an array of phrases
with {variables}
that map to a key in the entities
object. The developer defines the name of the variable.
For a custom enum entity, the type
is a custom name that you define and a corresponding entry must be added to the entityTypes
section. Custom enum entity types can include synonyms:
// Jovo model { "invocation": "my test app", "version": "4.0", "intents": { "ColorIntent": { "phrases": [ "i pick {color}", "my favorite color is {color}" ], "entities": { "color": { "type": "CUSTOM_COLORS" } } } }, "entityTypes": { "CUSTOM_COLORS": { "name": "CUSTOM_COLORS", "values": [ { "value": "red", "synonyms": [ "crimson" ] }, { "value": "yellow", "synonyms": [] }, { "value": "blue", "synonyms": [] }, { "value": "green", "synonyms": [] }, { "value": "orange", "synonyms": [] }, { "value": "purple", "synonyms": [] } ] } } }
In NLP.js, the list of values for an enum is "closed" meaning that only values (or synonyms) listed will be recognized.
Custom regex entities
An intent can include an array of phrases
with {variables}
that map to a key in the entities
object. The developer defines the name of the variable.
For a regex entity, the type
starts with "regex:"
followed by a regular expression:
// Jovo model { "invocation": "my test app", "version": "4.0", "intents": { "SsnIntent": { "phrases": [ "my tax id is {ssn}" ], "entities": { "ssn": { "type": "regex:/(?!000)(?!666)(?!9)[0-9]{3}[ -]?(?!00)[0-9]{2}[ -]?(?!0000)[0-9]{4}/g" } } } } }
Builtin entities
When defining the Jovo Model for builtin entities, you don't need to include the entities
object.
// Jovo model { "invocation": "my test app", "version": "4.0", "intents": { "NumberIntent": { "phrases": [ "the number is {number}", "pick a number between {number_0} and {number_1}" ] } } }
The NLP.js integration picks the names of the variables when an entity is matched. It doesn't matter what name you use in the utterance. For consistency, see the list of names below found in the allowList
array.
Builtin libraries
The NLP.js library allows for various builtin entity libraries to be configured. These libraries include popular regex matchers (recognizers) and post processing for numbers, dates and more.
Microsoft Builtin
To configure the Microsoft Builtin library, you must use the plug-in configuration setupModelCallback
function:
// app.ts import { Nlp, NlpjsNlu } from '@jovotech/nlu-nlpjs'; import { LangEn } from '@nlpjs/lang-en'; import { Platform } from '@jovotech/framework'; import { BuiltinMicrosoft } from '@nlpjs/builtin-microsoft'; const nlp = new NlpjsNlu({ languageMap: { en: LangEn }, enabled: true, setupModelCallback: async (platform: Platform, nlpjs: Nlp) => { // configure builtin const builtin = new BuiltinMicrosoft({ tag: 'builtin-microsoft', builtins: [ 'Age', 'Boolean', 'Currency', 'DateTime', 'Dimension', 'Email', 'Hashtag', 'IpAddress', 'Mention', 'Number', 'Ordinal', 'Percentage', 'PhoneNumber', 'Temperature', 'URL', ], allowList: [ 'age', 'boolean', 'currency', 'date', 'daterange', 'datetime', 'datetimealt', 'datetimerange', 'dimension', 'duration', 'email', 'hashtag', 'ip', 'mention', 'number', 'numberrange', 'ordinal', 'percentage', 'phonenumber', 'set', 'temperature', 'time', 'timerange', 'timezone', 'url', ], }); // register builtin nlpjs.container.register('extract-builtin-??', builtin, true); // eslint-disable-next-line @typescript-eslint/ban-ts-comment // @ts-ignore // convert Jovo Model to NLP.js corpus and add await nlp.addCorpusFromModelsIn('./models'); // eslint-disable-next-line @typescript-eslint/ban-ts-comment // @ts-ignore // train corpus await nlpjs.train(); }, }); app.configure({ plugins: [ new JovoDebugger({ nlu: nlp, // overwrite default NLP.js config }), ], });
NOTE:
To get TypeScript to compile with the Microsoft Builtin library, you may need to add a custom type file and update tsconfig.json
:
// types/custom.d.ts declare module '@nlpjs/builtin-microsoft';
// tsconfig.json { "compilerOptions": { "typeRoots": ["./node_modules/@types", "./types"] }, "include": [ "src", "types" ], }
The constructor for the BuiltinMicrosoft
object accepts a configuration. An empty config gets the default configuration. The tag must be "builtin-microsoft"
. The builtins
array is used to determine which recognizers are called. The allowList
array determines which entities appear in the result. You can tell by the names used in each list which recognizers go with which entity names.
NLP.js will call every recognizer listed for every input without regard to the intent matched. Whenever the regex in a recognizer matches a string pattern, the entity is added to the $entities
property. Only include the recognizers that you want to process in the builtins
array and confirm that the corresponding entity names are listed in the allowList
.
Combined builtin and custom entities
Here is an example of using an enum entity with a builtin entity:
// Jovo model { "invocation": "my test app", "version": "4.0", "intents": { "PetIntent": { "phrases": [ "i have a {pet}", "i want {number} {pet}" ], "entities": { "pet": { "type": { "nlpjs": "CUSTOM_PET" } } } } }, "entityTypes": { "CUSTOM_PET": { "name": "CUSTOM_PET", "values": [ { "value": "dog", "synonyms": ["dogs"] }, { "value": "cat", "synonyms": ["cats"] } ] }, } }
Limitations
Same type different name
NLP.js does not support multiple entities of the same type in the same phrase, for example fly from {fromCity} to {toCity}
. In that example, it is possible that the input fly from Berlin to Amsterdam
is mapped to two toCity
entities.
If you need to use a pattern like this, we recommend taking a look at other NLU integrations like Snips NLU.
Multiple Matches
For enum, regex and builtin entities if there are multiple matches, the entities are named with a suffix _0
, _1
, etc.
Examples:
- "pick red and blue" (enum): color_0, color_1
- "they are 111-22-3333 and 222-33-4444" (regex): ssn_0, ssn_1
- "from 1 to 10" (builtin): number_0, number_1
Matches are always left to right.
Aggressive entity matching
Another limitation of NLP.js is that the intent matching is separate from the processing of entities. This means that you could end up with entities for an intent that doesn't define it. This applies to enum, regex and builtin entities.
For example, here is an intent definition that includes no entities:
// Jovo model { "invocation": "my test app", "version": "4.0", "intents": { "YesIntent": { "phrases": [ "yes", "yes please", "sure" ] } } }
The plug-in is configured to use builtin entities and also defines a CUSTOM_COLOR enum.
When the input is "yes red please"
the NLU response includes entities:
// $input.nlu { "intent": { "name": "YesIntent" }, "entities": { "color": { "id": "red", "resolved": "red", "value": "red", "native": { "start": 4, "end": 6, "len": 3, "levenshtein": 0, "accuracy": 1, "entity": "color", "type": "enum", "option": "red", "sourceText": "red", "utteranceText": "red" } }, "boolean": { "value": "yes", "native": { "start": 0, "end": 2, "len": 3, "accuracy": 0.95, "sourceText": "yes", "utteranceText": "yes", "entity": "boolean", "rawEntity": "boolean", "resolution": { "value": true, "score": 0.6, "otherResults": [] } } } } }
Trim entities
NLP.js allows for the definition of trim rules for entity matching. This is not supported by the Jovo Model v4.0 format.
Learn more about Trim entities here.
To get around this Jovo 4 model limitation, you can train an NLP.js corpus file directly:
// app.ts import { Nlp, NlpjsNlu } from '@jovotech/nlu-nlpjs'; import { LangEn } from '@nlpjs/lang-en'; import { Platform } from '@jovotech/framework'; import corpus from './nlpjs-corpus-en.json'; const nlp = new NlpjsNlu({ languageMap: { en: LangEn }, enabled: true, setupModelCallback: async (platform: Platform, nlpjs: Nlp) => { // eslint-disable-next-line @typescript-eslint/ban-ts-comment // @ts-ignore // await nlp.addCorpusFromModelsIn('./models'); nlpjs.addCorpus(corpus[0].content); // eslint-disable-next-line @typescript-eslint/ban-ts-comment // @ts-ignore await nlpjs.train(); }, }); app.configure({ plugins: [ new JovoDebugger({ nlu: nlp, // overwrite default NLP.js config }), ], });