jsonPlug

Overview

jsonPlug is a post-processor that uses the JSONPath query language to make simple changes to the output object.
With this module you can delete, add and modify categories and extracted records.

The module depends on the jsonpath module, which must therefore be installed as well.

When you install the jsonPlug module in your project from Studio, Studio modifies the main.jr file by inserting this statement at the beginning of the file:

var jsonPlug = require('modules/jsonPlug');

The statement above sets a variable with an instance of the module so that you can use it in all event handling functions.

This module has to be invoked in the onFinalize function, where the results object is available. It has only one method, jsonPlug, whose syntax is:

moduleVariable.jsonPlug(result, action, jsPathConditionFlag, jsPath, jsPathAction, recursive, values, skipNameValidation)

where:

moduleVariable is the variable corresponding to the module and set with require().
result is the object containing the analysis results.
action is the action to perform on the results object. The available values of this parameter are:
- delete
- delete template
- delete record
- add field
- add template
- add record
- clone
- clone new
- clone value
- clone instances
- add category
- modify
- modify regex
- apply math
Actions are described below.
jsPathConditionFlag is either a boolean or a string. In the first case, it can be:
- true: the action is applied only if the jsPath expression selects one or more nodes.
- false: the action is applied only if the jsPath expression doesn't select any nodes.
In the second case, it is a string having the following format:
```
count operator integer
```
where:
- operator can be:
  - >
  - <
  - <=
  - >=
  - =
  - ==
- integer is a non-negative integer number.
Or, it can be a string with the following format:
```
multiple
```
where multiple is used to evaluate several JSONPath expressions, each paired with a true/false boolean, in order to validate the condition for performing an action (see the example in modify).

In the count form, jsPathConditionFlag is a condition on the number of nodes (the count) selected by jsPath. The action is applied if the condition is satisfied.

For example, if jsPathConditionFlag is:
```
count > 3
```
it means that jsPath must select more than three nodes for the action to be applied.
jsPath is the JSONPath expression that, in combination with jsPathAction (except for the case when also jsPathAction is a full JSONPath expression), determines the nodes to which the action must be applied.

Info

Several useful resources are available on the Web to get more information about JSONPath syntax and to test JSONPath expressions.
jsPathAction is the JSONPath expression that, alone or in combination with jsPath, determines where the action must be applied. Its value can be:
- #this# or .: the action is applied to the nodes selected by jsPath.
- ^: the action is applied to the parent nodes of the nodes selected by jsPath.
- ^ relativepath: the action is applied to the sibling nodes of the nodes selected by jsPath that match relativepath.
- a full JSONPath: the action is applied to the nodes selected by jsPathAction if jsPathConditionFlag is satisfied.
recursive is a boolean indicating whether the action has to be applied only to the first or to all of the nodes selected by jsPath and/or jsPathAction.
values is an array containing the specification of the action. The meaning of the array items and their order vary according to the action.
skipNameValidation, if set to true, skips the template and field name validation. This optional flag is used for modify and modify regex.

Note

For modify, values can also be a string.
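
The boolean and count forms of jsPathConditionFlag described above can be read as a simple test on the number of nodes selected by jsPath. The sketch below illustrates that reading in plain JavaScript. The conditionHolds function is a hypothetical helper written for this page, not part of jsonPlug, and it does not cover the multiple form.

```javascript
// Hypothetical helper: decides whether an action would run, given the
// jsPathConditionFlag and the number of nodes selected by jsPath.
// This is NOT jsonPlug's code, only an illustration of the documented rules.
function conditionHolds(flag, nodeCount) {
    if (flag === true) return nodeCount >= 1;   // at least one node selected
    if (flag === false) return nodeCount === 0; // no node selected

    // String form: "count operator integer", for example "count > 3".
    const match = /^\s*count\s*(>=|<=|==|=|>|<)\s*(\d+)\s*$/.exec(String(flag));
    if (!match) {
        throw new Error("Unsupported jsPathConditionFlag: " + flag);
    }
    const operator = match[1];
    const threshold = Number(match[2]);
    switch (operator) {
        case ">":  return nodeCount > threshold;
        case "<":  return nodeCount < threshold;
        case ">=": return nodeCount >= threshold;
        case "<=": return nodeCount <= threshold;
        default:   return nodeCount === threshold; // "=" and "=="
    }
}

console.log(conditionHolds("count > 3", 3)); // false
console.log(conditionHolds("count > 3", 4)); // true
console.log(conditionHolds(false, 0));       // true
```

With "count > 3", three selected nodes are not enough: the action would run only from the fourth node on.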

Actions

The following sections describe the possible values of the action parameter. The values parameter is interpreted according to the value of action.

delete

Use delete to delete extraction fields, a record or a category.

For example, consider this template:

TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Date_of_birth,
    @Phone_number,
    @Address,
    @Job,
    @Type_of_job
}

If this rule:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Phone_number[TYPE(PHO)]

    }
}

is applied to the following input text:

Stephen King's number is 0000000000.

you will get this output:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "delete", true, "$..extraction[?(@.template == 'PERSONAL_DATA')]..fields[?(@.field == 'Name')]", "#this#", true, "");
    return result;
}

you will get:

As you can see, the Name field belonging to the PERSONAL_DATA record was deleted.

The values parameter is ignored by this action and can be left empty.
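
As a rough illustration of the effect, the following plain JavaScript sketch mimics the deletion on a simplified result object without calling jsonPlug. The object shape (an extraction array of records, each with a template name and a fields array of field/value pairs) is an assumption inferred from the JSONPath expressions used on this page, and the field values are assumed for the example sentence.

```javascript
// Simplified result object; shape and values are assumptions, not the
// actual output of the analysis.
const result = {
    extraction: [
        {
            template: "PERSONAL_DATA",
            fields: [
                { field: "Name", value: "Stephen King" },
                { field: "Phone_number", value: "0000000000" }
            ]
        }
    ]
};

// Plain-JavaScript counterpart of deleting the nodes selected by
// $..extraction[?(@.template == 'PERSONAL_DATA')]..fields[?(@.field == 'Name')]
for (const record of result.extraction) {
    if (record.template === "PERSONAL_DATA") {
        record.fields = record.fields.filter((f) => f.field !== "Name");
    }
}

console.log(JSON.stringify(result.extraction[0].fields));
// [{"field":"Phone_number","value":"0000000000"}]
```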

delete template and delete record

Use delete template or delete record to delete whole records.

For example, consider this template:

TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Date_of_birth,
    @Phone_number,
    @Address,
    @Job,
    @Type_of_job
}

If the rule used for delete is applied to the same input text as above, with this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "delete template", true, "$..extraction[?(@.template == 'PERSONAL_DATA')]..fields[?(@.field == 'Name')]", "#this#", true, "");
    return result;
}

or with this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "delete record", true, "$..extraction[?(@.template == 'PERSONAL_DATA')]..fields[?(@.field == 'Name')]", "#this#", true, "");
    return result;
}

all the PERSONAL_DATA records containing a Name field are deleted.

The values parameter is ignored by this action and can be left empty.
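
The difference from delete is that the whole record goes away, not just the selected field. The sketch below mimics this in plain JavaScript on a simplified result object. The object shape is an assumption inferred from the JSONPath expressions on this page, the second record is invented to show a record that survives, and deleteRecordsWithField is a hypothetical helper, not part of jsonPlug.

```javascript
// Simplified result object; shape and values are assumptions.
const result = {
    extraction: [
        {
            template: "PERSONAL_DATA",
            fields: [
                { field: "Name", value: "Stephen King" },
                { field: "Phone_number", value: "0000000000" }
            ]
        },
        {
            // Invented record without a Name field: it must survive.
            template: "PERSONAL_DATA",
            fields: [{ field: "Phone_number", value: "1111111111" }]
        }
    ]
};

// Hypothetical helper: drops every record of the given template that
// contains the given field.
function deleteRecordsWithField(target, templateName, fieldName) {
    target.extraction = target.extraction.filter((record) =>
        !(record.template === templateName &&
          record.fields.some((f) => f.field === fieldName)));
    return target;
}

deleteRecordsWithField(result, "PERSONAL_DATA", "Name");
console.log(result.extraction.length); // 1
```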

add field

Use add field to add a field to a record.

For example, consider this template:

TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Age,
    @Date_of_birth,
    @Phone_number,
    @Address,
    @Job,
    @Type_of_job
}

If this rule:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Job[LEMMA("writer")]
        <1:3>
        LEMMA("comic")
    }
}

is applied to the following input text:

Alan Moore is considered as the best writer of comics ever.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "add field", true, "$..extraction[?(@.template == 'PERSONAL_DATA')]..fields[?(@.field == 'Name')]", "#this#", true, ["Age", "68"]);
    return result;
}

you will get:

As you can see, the Age field with a value of 68 was added to the PERSONAL_DATA record.

The contents of the values array must be:

fieldName, fieldValue

where:

fieldName is the field name.
fieldValue is the field value.

add template and add record

Use add template or add record to create a new record of an existing template.

For example, consider these templates:

TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Date_of_birth,
    @Phone_number,
    @Address,
    @Age,
    @Job
}

TEMPLATE(COMPANY)
{
    @Location,
    @Name
}

If the following rule:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Job[LEMMA("knowledge engineer")]
    }
}

is applied to this input text:

Jonathan works as a knowledge engineer.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "add template", true, "$..extraction[?(@.template == 'PERSONAL_DATA')]..fields[?(@.field == 'Name')]", "#this#", true, ["COMPANY", "Location", "Rovereto", "Name", "Expert.ai"]);
    return result;
}

or with this one:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "add record", true, "$..extraction[?(@.template == 'PERSONAL_DATA')]..fields[?(@.field == 'Name')]", "#this#", true, ["COMPANY", "Location", "Rovereto", "Name", "Expert.ai"]);
    return result;
}

you will get:

As you can see, a new COMPANY record was added, having the Location field set to Rovereto and the Name field set to Expert.ai.

The contents of the values array must be:

templateName, field1Name, field1Value [, field2Name, field2Value [, ... fieldnName, fieldnValue]]

where:

templateName is the template name.
field#Name is the name of a field.
field#Value is the value of that field.

The first three items are mandatory in order to define at least one field. The other items, when present, have to be added in pairs to define additional fields and their values.
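
Because the values array is flat, it is easy to misplace a name or a value when a record has several fields. The sketch below shows one way to build the array from name/value pairs. The buildRecordValues function is a hypothetical helper introduced for illustration only; jsonPlug does not provide it.

```javascript
// Hypothetical helper (not part of jsonPlug): flattens a template name and
// a list of [fieldName, fieldValue] pairs into the layout described above:
// templateName, field1Name, field1Value, field2Name, field2Value, ...
function buildRecordValues(templateName, fieldPairs) {
    if (fieldPairs.length === 0) {
        throw new Error("At least one field name/value pair is required");
    }
    const values = [templateName];
    for (const [name, value] of fieldPairs) {
        values.push(name, value);
    }
    return values;
}

const values = buildRecordValues("COMPANY", [
    ["Location", "Rovereto"],
    ["Name", "Expert.ai"]
]);
console.log(JSON.stringify(values));
// ["COMPANY","Location","Rovereto","Name","Expert.ai"]

// The array could then be passed as the values argument, for example:
// jsonPlug.jsonPlug(result, "add record", true, jsPath, "#this#", true, values);
```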

Note

If jsPathConditionFlag is set to false, jsPathAction must be $..extraction.

clone

Use clone to clone extraction fields and optionally modify the cloned values.

For example, consider this template:

TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Date_of_birth,
    @Phone_number,
    @Address,
    @Age,
    @Job,
    @Type_of_job
}

If the following rule:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Job[LEMMA("software engineer")]
    }
}

is applied to this input text:

Jane works as a software engineer.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "clone", true, "$..extraction[?(@.template == 'PERSONAL_DATA')]..fields[?(@.field == 'Job')]", "#this#", true, ["Type_of_job", "no tokens"]);
    return result;
}

you will get:

As you can see, the Job field was cloned into the Type_of_job field.

In a common use case, the new field is then processed to change its value. You can see an example of this in the description of the modify action.

The contents of the values array must be:

fieldName, cloneOption[, regularExpression, replacementString]

Note

The parts in square brackets are optional.

where:

fieldName is either the new field name or an empty string, meaning that the cloned field will have the name of the source field.
cloneOption can be:
- no tokens: the new field will have no references to the rule(s) that determined the extraction and to the text that triggered the rule(s).
- clone from source or an empty string: the new field has the same information as the source field in terms of triggered rules and triggering text.
- clone from sibling: for cases in which jsPathAction is used to select sibling nodes, the new field has the same information, in terms of triggered rules and triggering text, as the field selected by jsPathAction.
regularExpression is the regular expression that determines the parts of the node value to change.
replacementString is the replacement string, where placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.

clone new

Use clone new to create a record of a predefined template containing a clone of an existing field.

For example, consider these templates:

TEMPLATE(ATHLETES)
{
    @Name,
    @Sport_discipline
}

TEMPLATE(OLYMPIC_CHAMPIONS)
{
    @Proper_name
}

If this rule:

SCOPE SENTENCE
{
    IDENTIFY(ATHLETES)
    {
        @Name[TYPE(NPH)]
        <>
        @Sport_discipline[LEMMA("swimmer")]
    }
}

is applied to the following input text:

Federica Pellegrini is one of the best swimmers of all time.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "clone new", true, "$..extraction[?(@.template == 'ATHLETES')]..fields[?(@.field == 'Name')]", "#this#", true, ["OLYMPIC_CHAMPIONS", "Proper_name", "no tokens"]);
    return result;
}

you will get:

As you can see, a new OLYMPIC_CHAMPIONS record was created with the Proper_name field having the value of the field identified by the jsPath expression.

The contents of the values array must be:

templateName, fieldName, cloneOption

where:

templateName is the new record template name.
fieldName is either the new field name or an empty string, meaning that the cloned field will have the name of the source field.
cloneOption can be:
- no tokens: the new field will have no references to the rule(s) that determined the extraction and to the text that triggered the rule(s).
- clone from source or an empty string: the new field has the same information as the source field in terms of triggered rules and triggering text.
- clone from sibling: for cases in which jsPathAction is used to select sibling nodes, the new field has the same information, in terms of triggered rules and triggering text, as the field selected by jsPathAction.

clone value

Use clone value to copy an extracted value, optionally modifying it, into another predefined field.

For example, consider this template:

TEMPLATE(PERSONAL_DATA)
{
    @NAME,
    @AGE,
    @ADDRESS,
    @NICKNAME
}

If these rules:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @NAME[TYPE(NPH)]
    }

    IDENTIFY(PERSONAL_DATA)
    {
        @NICKNAME[TYPE(NPH)]
    }
}

are applied to this input text:

Hello Alan.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "clone value", true,
        "$..extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NAME')]",
        "$..extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NICKNAME')]",
        true, [true, /^(.+)$/, "The bard of Northampton"]);
    return result;
}

you will get:

As you can see, the value of the field NAME has been turned into The bard of Northampton and moved to the field NICKNAME.

To modify the cloned value, the contents of the values array must be:

replaceFlag, regularExpression, replacementString

where:

replaceFlag is a boolean; set it to true to apply a regular expression that modifies the extracted value.
regularExpression is the regular expression that determines the parts of the value to change.
replacementString is the replacement string where placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.

To clone the value without modifying it, leave the values array empty.

clone instances

Use clone instances to replace the normalized field values with the extracted textual values.

This method can be very useful when used in combination with tagging and/or transformation.

For example, consider this template and tag:

TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Age,
    @Address,
    @Job_type
}

TAGS
{
    @TAG1
}

If these rules:

SCOPE SENTENCE
{
    TAGGER()
    {
        @TAG1[LEMMA("developer", "software developer")]
    }

    IDENTIFY(PERSONAL_DATA)
    {
        @Job_type[TAG(TAG1)]|[TAG]
    }
}

are applied to this input text:

Marco is a developer and Jonathan and Mary are also software developers.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "clone instances", true, "$..extraction[?(@.template == 'PERSONAL_DATA')]..fields[?(@.field == 'Job_type')].value", "#this#", true, ["longest instance", true, /^((software) developer(s)?)$/gi, "$2 dev$3."]);
    return result;
}

you will get:

As you can see, the longest textual value software developers was extracted and turned into software devs.

The contents of the values array must be:

instanceType, replaceFlag, regularExpression, replacementString

where:

instanceType is a flag that establishes which text value will be copied. It can be:
- all instances: clone all text instances separated by a pipe character (|).
- longest instance: clone the first longest instance of the text values.
- first instance: clone the first instance of the text values.
replaceFlag is a boolean. It can be:
- false: clone the text values as they are.
- true: apply a regular expression and the replacement string.
regularExpression is the regular expression that determines the parts of the value to change.
replacementString is the replacement string where placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.

Note

The last two items, regularExpression and replacementString, must be provided if replaceFlag is set to true.
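
The two steps described above, choosing which textual instance to copy and then optionally rewriting it, can be sketched in plain JavaScript as follows. The pickInstances function is a hypothetical helper, not jsonPlug's code, and the list of instances is assumed for the example sentence. The replacement is performed with the standard String.prototype.replace, on the assumption that the module treats $1, $2, etc. in the same way.

```javascript
// Hypothetical helper (not jsonPlug's code): picks the text to clone from
// the textual instances of a field, following the instanceType options.
function pickInstances(instances, instanceType) {
    switch (instanceType) {
        case "all instances":
            return instances.join("|");
        case "first instance":
            return instances[0];
        case "longest instance":
            // On ties the earlier item is kept, i.e. the first longest one.
            return instances.reduce((best, s) => (s.length > best.length ? s : best));
        default:
            throw new Error("Unknown instanceType: " + instanceType);
    }
}

// Instance list assumed for the example sentence above.
const instances = ["developer", "software developers"];
const picked = pickInstances(instances, "longest instance");

// Same regular expression and replacement string as in the example call.
const replaced = picked.replace(/^((software) developer(s)?)$/gi, "$2 dev$3.");
console.log(replaced); // software devs.
```

Here $2 is the captured word software and $3 is the optional plural s, so the trailing period in the result comes from the replacement string itself.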

add category

Use add category to add a new category.

For example, consider this taxonomy:

1   Animals 
    1.1 Cats
    1.2 Dogs

and this template:

TEMPLATE(DOGS_BREED)
{
    @Name
}

If this rule:

SCOPE SENTENCE
{
    IDENTIFY(DOGS_BREED)
    {
        @Name[ANCESTOR(100000144)] //@SYN: #100000144# [dog]
    }
}

is applied to this input text:

Rex is a beautiful 10 year old German Shepherd, and his help was extremely important for the Police.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "add category", true, "$..extraction[?(@.template == 'DOGS_BREED')]", "#this#", true, ["1.2", "Dogs", 10, 100.0]);
    return result;
}

you will get:

As you can see, the 1.2 category with the Dogs label was added with a category score and compound score equal to 10 and a frequency equal to 100.0%.

Parameters jsPathAction and recursive are ignored.

The contents of the values array must be:

categoryName, categoryLabel, scoreAndCompound, categoryFrequency

where:

categoryName is the category name.
categoryLabel is the category label.
scoreAndCompound is a non-negative integer number used for both the category score and the compound score.
categoryFrequency is a non-negative decimal number used for the category frequency.

modify

Use modify to change:

- The category name
- The category label
- The record template name
- The field name
- The field value

For example, consider these templates, which differ only in name:

TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Date_of_birth,
    @Geographical_location,
    @Main_works
}

TEMPLATE(COMIC_WRITERS)
{
    @Name,
    @Date_of_birth,
    @Geographical_location,
    @Main_works
}

If the following rule:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Date_of_birth[TYPE(DAT)]
        <>
        @Geographical_location[SYNCON(100192240)]  //@SYN: #100192240# [Northampton]
    }
}

is applied to this input text:

Mr. Alan Moore was born on the 18th of November 1953 in Northampton.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "modify", true, "$..extraction[?(@.template == 'PERSONAL_DATA')].template", "#this#", true, ["COMIC_WRITERS"]);
    return result;
}

or with this one:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "modify", true, "$..extraction[?(@.template == 'PERSONAL_DATA')].template", "#this#", true, "COMIC_WRITERS");
    return result;
}

you will get:

As you can see, the record template name has changed from PERSONAL_DATA to COMIC_WRITERS.

values must contain one item that is the new value for the selected nodes. values can be an array or a string.

The same output can also be obtained with the following code using the multiple string as jsPathConditionFlag:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "modify", "multiple",
        [
            false, "$..extraction[?(@.template == 'COMIC_WRITERS')]",
            true, "$..extraction[?(@.template == 'PERSONAL_DATA')]",
            true, ".fields[?(@.field == 'Date_of_birth' && @.value == 'Nov-18-1953')]",
            true, "^.template"
        ], "#this#", true, ["COMIC_WRITERS"], true);
    return result;
}

With this example code, the record template name will be modified only if:

- There is no initial record named COMIC_WRITERS.
- There is an initial record named PERSONAL_DATA.
- There is a Date_of_birth field whose value, in its base form, is Nov-18-1953.

Note

If these conditions are satisfied, ^.template goes back to the template level modifying its name.
The final true will determine the final match of the whole array.

modify regex

modify regex works like modify, the only difference being that a regular expression is used to determine which parts of the value have to be replaced.

For example, consider the following templates:

TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Date_of_birth,
    @Geographical_location,
    @Main_works
}

TEMPLATE(PERSONAL_DATA_TEST)
{
    @Name,
    @Date_of_birth,
    @Geographical_location,
    @Main_works
}

If this rule:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Date_of_birth[TYPE(DAT)]
        <>
        @Geographical_location[SYNCON(100192240)]  //@SYN: #100192240# [Northampton]
    }
}

is applied to this input text:

Mr. Alan Moore was born on the 18th of November 1953 in Northampton.

this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "modify regex", true, "$..extraction[?(@.template == 'PERSONAL_DATA')].template", "#this#", true, [/^(.+)$/, "$1_TEST"]);
    return result;
}

will produce the following change:

As you can see, the record name has changed. The new one is a concatenation of the first regular expression capturing group plus _TEST.

The contents of the values array must be:

regularExpression, replacementString

where:

regularExpression is the regular expression that determines the parts of the node value to change.
replacementString is the replacement string where placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.
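
The effect of the regularExpression and replacementString pair can be tried out with the standard JavaScript String.prototype.replace. The sketch below assumes that the module applies the replacement in the same way, which this page does not state explicitly. The second pattern is an invented variation showing a partial match.

```javascript
// Same regular expression and replacement string as in the example call.
const pattern = /^(.+)$/;
const replacement = "$1_TEST";

// $1 refers to the first capturing group, here the whole template name.
const renamed = "PERSONAL_DATA".replace(pattern, replacement);
console.log(renamed); // PERSONAL_DATA_TEST

// Invented variation: only the part matched by the pattern is replaced.
const partial = "PERSONAL_DATA".replace(/DATA$/, "INFO");
console.log(partial); // PERSONAL_INFO
```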

apply math

Use apply math to apply mathematical operations to the extracted fields.

For example, consider this template:

TEMPLATE(TEST)
{
    @TOTAL_MONEY,
    @VALUE_TO_SUBTRACT
}

If these rules:

SCOPE SENTENCE
{
    IDENTIFY(TEST)
    {
        !LEMMA("tax")
        <1:2>
        @TOTAL_MONEY[TYPE(MON)]
    }

    IDENTIFY(TEST)
    {
        LEMMA("tax")
        <1:2>
        @VALUE_TO_SUBTRACT[TYPE(MON)]
    }
}

are applied to this input text:

On a total amount of 40000€, your taxes are 5000€

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "apply math", true,
        "$..extraction[?(@.template == 'TEST')].fields[?(@.field == 'VALUE_TO_SUBTRACT')].value",
        "$..extraction[?(@.template == 'TEST')].fields[?(@.field == 'TOTAL_MONEY')].value",
        true, ["subtract", "jspath", true, "."]);
    return result;
}

you will get:

As you can see, the value of the field VALUE_TO_SUBTRACT was subtracted from the value of the field TOTAL_MONEY.

The contents of the values array must be:

mathematicalOperation, dynamicValue, removalFlag, separator, rounder

where:

mathematicalOperation is the mathematical operation to apply (case insensitive). It can be:
- Add: adds the value to the matched jsPathAction.
- Multiply: multiplies the value by the matched jsPathAction.
- Subtract: subtracts the value from the matched jsPathAction.
- Divide: divides the matched jsPathAction by the value.
- Swap divide: divides the value by the matched jsPathAction.
- Swap subtract: subtracts the matched jsPathAction from the value.
- Calculate % from digit: calculates the percentage of the static/JSONPath value compared to the jsPathAction value.
- Calculate digit from %: calculates to which percentage the jsPathAction value corresponds compared to the static/JSONPath value.
dynamicValue is either a static numerical value or a special keyword (jspath in the example above) indicating that the value is a variable to be taken from the matched JSONPath.
removalFlag is a boolean value, mandatory if you have non-numerical characters like currencies, otherwise optional. If set to true, all non-numerical characters will be removed when parsing the values.
separator is the optional thousands separator. It can be left empty (no separator added) or can be:
- .
- ,
- none (no separator added)
rounder is the optional number of decimals after which rounding is applied. It can be left empty. If 0, the number is rounded to its closest integer.
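
The arithmetic behind the basic operations can be sketched in plain JavaScript as shown below. This is a hypothetical illustration, not jsonPlug's code: toNumber, applyMath and withSeparator are invented names, the two percentage operations and the rounder are left out, and keeping digits, the decimal point and the minus sign when removalFlag is true is an assumption.

```javascript
// Hypothetical sketch (not jsonPlug's code) of the arithmetic described above.
// "actionValue" is the value matched by jsPathAction, "value" is the static
// or jsPath-derived operand.
function toNumber(raw, removeNonNumeric) {
    const text = String(raw);
    // Assumption: digits, the decimal point and a minus sign are kept.
    return Number(removeNonNumeric ? text.replace(/[^0-9.\-]/g, "") : text);
}

function applyMath(operation, actionValue, value) {
    switch (operation.toLowerCase()) {
        case "add":           return actionValue + value;
        case "multiply":      return actionValue * value;
        case "subtract":      return actionValue - value;
        case "divide":        return actionValue / value;
        case "swap subtract": return value - actionValue;
        case "swap divide":   return value / actionValue;
        default:
            throw new Error("Operation not covered by this sketch: " + operation);
    }
}

// Inserts a thousands separator into the integer part of a number.
function withSeparator(number, separator) {
    if (!separator || separator === "none") return String(number);
    const [integerPart, decimalPart] = String(number).split(".");
    const grouped = integerPart.replace(/\B(?=(\d{3})+(?!\d))/g, separator);
    return decimalPart === undefined ? grouped : grouped + "." + decimalPart;
}

const total = toNumber("40000€", true); // 40000
const taxes = toNumber("5000€", true);  // 5000
const net = applyMath("subtract", total, taxes);
console.log(withSeparator(net, "."));   // 35.000
```

In the example call, TOTAL_MONEY plays the role of actionValue and VALUE_TO_SUBTRACT the role of value, which is why subtract yields the total minus the taxes.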