clone actions

Introduction

The following actions are described on this page:

clone
clone new
clone new multiple
clone value
clone template
clone record
clone instances

Note

When a clone is based on instances with different confidence scores, the confidence score of the cloned field is calculated with the same formula used for the confidence of instances. This applies to all actions except clone instances.

clone

Use clone to clone extraction fields and optionally modify the cloned values.

For example, consider this template:

TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Date_of_birth,
    @Phone_number,
    @Address,
    @Age,
    @Job,
    @Type_of_job
}

If the following rule:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Job[LEMMA("software engineer")]
    }
}

is applied to this input text:

Jane works as a software engineer.

you will get this record:

Template: PERSONAL_DATA

Field	Value
@Name	Jane
@Job	software engineer

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Job')]",
        jsPathAction: "#this#",
        values: ["Type_of_job", "no tokens", "SCRIPT('toUpper')"]
    });
    return result;
}

you will get this record:

Template: PERSONAL_DATA

Field	Value
@Type_of_job	SOFTWARE ENGINEER
@Name	Jane
@Job	software engineer

As you can see, the Job field was cloned into the Type_of_job field, whose value was converted to uppercase by the toUpper built-in function.

As a different example, consider the same template, but these rules:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Job[LEMMA("software engineer")]
    }

    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Job[KEYWORD("knowledge engineer")]
    }

    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
    }
}

If they are applied to this input text:

Jane works as a software engineer. 
Mary is a knowledge engineer.
James is qualified for both jobs.

you will get these records:

Template: PERSONAL_DATA

Field	Value
@Name	Jane
@Job	software engineer

Template: PERSONAL_DATA

Field	Value
@Name	Mary
@Job	knowledge engineer

Template: PERSONAL_DATA

Field	Value
@Name	James

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Job')]",
        jsPathAction: [
            true, "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')]",
            false, ".fields[?(@.field == 'Job')]"
        ],
        values: ["Type_of_job", "no tokens"]
    });
    return result;
}

you will get these records:

Template: PERSONAL_DATA

Field	Value
@Name	Jane
@Job	software engineer

Template: PERSONAL_DATA

Field	Value
@Name	Mary
@Job	knowledge engineer

Template: PERSONAL_DATA

Field	Value
@Type_of_job	software engineer
@Type_of_job	knowledge engineer
@Name	James

As you can see, both values of the Job field were cloned into the Type_of_job field in the record that initially had the Name field only.

In a common use case, the new field is then processed to change its value. You can see an example of this in the description of the modify action.

The contents of the values array must be:

fieldName, cloneOption[, regularExpression, replacementString]

or:

fieldName, cloneOption[, replacementString]

or:

fieldName, cloneOption[, scriptingFunctions]

Note

The parts in square brackets are optional.

where:

fieldName is either the new field name or an empty string, meaning that the cloned field will have the name of the source field.
cloneOption can be:
- no tokens: the new field will have no references to the rule(s) that determined the extraction or to the text that triggered the rule(s).
- clone from source or an empty string: the new field has the same information as the source field in terms of triggered rules and triggering text.
- clone from sibling: for cases in which jsPathAction is used to select sibling nodes, the new field has the same information, in terms of triggered rules and triggering text, as the field selected by jsPathAction.
regularExpression is the regular expression that determines the parts of the node value to change.
replacementString is the replacement string, where placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.
scriptingFunctions is one or more scripting functions, with any parameters they need, called with the same syntax as the SCRIPT attribute to further modify the output.
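
As a sketch of the first form listed above, the options below pass a regular expression and a replacement string after the clone option. The regular expression, the replacement string and the resulting value are illustrative assumptions, not taken from a real run. The plain String.prototype.replace call only approximates how the capturing-group placeholders would be resolved.

```javascript
// Hypothetical options for the clone action using the form:
// fieldName, cloneOption, regularExpression, replacementString
const cloneWithRegex = {
    action: "clone",
    jsPathConditionFlag: true,
    jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Job')]",
    jsPathAction: "#this#",
    values: ["Type_of_job", "no tokens", /^(\w+) engineer$/, "$1 eng."]
};

// Plain JavaScript approximation of the replacement step.
// Assumption: placeholders behave like those of String.prototype.replace.
const sourceValue = "software engineer";
const clonedValue = sourceValue.replace(cloneWithRegex.values[2], cloneWithRegex.values[3]);
console.log(clonedValue); // software eng.
```

Inside onFinalize, an object like this would be passed as the second argument of jsonPlug.jsonPlug, as in the examples above.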

Note

Regular expressions cannot be combined with scripting functions in the values array. However, regular expressions can be used within a scripting function.

Note

If you clone a field into a record that already contains a field with the same name and the same extracted value, the two fields are merged into a single field that inherits all their instances.
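
The merge described in the note above can be pictured with a small plain JavaScript model. It does not call jsonPlug, and the field shape used here (field, value, instances) is a simplifying assumption rather than the actual structure of the result object.

```javascript
// Illustrative model only: the { field, value, instances } shape is assumed.
function addClonedField(fields, cloned) {
    // Look for a field with the same name and the same extracted value.
    const existing = fields.find(
        f => f.field === cloned.field && f.value === cloned.value
    );
    if (existing) {
        // Merge: the surviving field inherits the instances of both.
        existing.instances = existing.instances.concat(cloned.instances);
    } else {
        fields.push(cloned);
    }
    return fields;
}

const recordFields = [
    { field: "Type_of_job", value: "software engineer", instances: ["i1"] }
];
addClonedField(recordFields, { field: "Type_of_job", value: "software engineer", instances: ["i2"] });
addClonedField(recordFields, { field: "Type_of_job", value: "knowledge engineer", instances: ["i3"] });
console.log(recordFields.length); // 2
```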

clone new

Use clone new to create a single record of a predefined template containing a clone of an existing field.

For example, consider these templates:

TEMPLATE(ATHLETES)
{
    @Name,
    @Sport_discipline
}

TEMPLATE(OLYMPIC_CHAMPIONS)
{
    @Proper_name
}

If this rule:

SCOPE SENTENCE
{
    IDENTIFY(ATHLETES)
    {
        @Name[TYPE(NPH)]
        <>
        @Sport_discipline[LEMMA("swimmer")]
    }
}

is applied to the following input text:

Federica Pellegrini is one of the best swimmers of all time.

you will get this record:

Template: ATHLETES

Field	Value
@Sport_discipline	swimmer
@Name	Federica Pellegrini

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone new",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'ATHLETES')].fields[?(@.field == 'Name')]",
        jsPathAction: "#this#",
        values: ["OLYMPIC_CHAMPIONS", "Proper_name", "no tokens", "SCRIPT('toLower')"]
    });
    return result;
}

you will get these records:

Template: ATHLETES

Field	Value
@Sport_discipline	swimmer
@Name	Federica Pellegrini

Template: OLYMPIC_CHAMPIONS

Field	Value
@Proper_name	federica pellegrini

As you can see, a new OLYMPIC_CHAMPIONS record was created with the Proper_name field having the value of the field identified by the jsPath expression. The field value was converted to lowercase by the toLower built-in function.

The contents of the values array must be:

templateName, fieldName, cloneOption[, scriptingFunctions]

where:

templateName is the new record template name.
fieldName is either the new field name or an empty string, meaning that the cloned field will have the name of the source field.
cloneOption can be:
- no tokens: the new field will have no references to the rule(s) that determined the extraction or to the text that triggered the rule(s).
- clone from source or an empty string: the new field has the same information as the source field in terms of triggered rules and triggering text.
- clone from sibling: for cases in which jsPathAction is used to select sibling nodes, the new field has the same information, in terms of triggered rules and triggering text, as the field selected by jsPathAction.
scriptingFunctions is one or more scripting functions, with any parameters they need, called with the same syntax as the SCRIPT attribute to further modify the output.

clone new multiple

clone new multiple works like clone new, except that a separate record is created for each value of the field to be cloned.

For example, consider these templates:

TEMPLATE(ATHLETES)
{
    @Name,
    @Sport_discipline
}

TEMPLATE(OLYMPIC_CHAMPIONS)
{
    @Proper_name
}

If this rule:

SCOPE SENTENCE
{
    IDENTIFY(ATHLETES)
    {
        @Name[TYPE(NPH)]
        <>
        @Sport_discipline[LEMMA("swimmer")]
    }
}

is applied to this text:

Federica Pellegrini is one of the best swimmers of all time.
Michael Phelps is also one of the best swimmers of all time.

you will get these records:

Template: ATHLETES

Field	Value
@Sport_discipline	swimmer
@Name	Federica Pellegrini

Template: ATHLETES

Field	Value
@Sport_discipline	swimmer
@Name	Michael Phelps

With each of these two code snippets:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone new",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'ATHLETES')].fields[?(@.field == 'Name')]",
        jsPathAction: "#this#",
        values: ["OLYMPIC_CHAMPIONS", "Proper_name", "no tokens"]
    });
    return result;
}

Or:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone new multiple",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'ATHLETES')].fields[?(@.field == 'Name')]",
        jsPathAction: "#this#",
        values: ["OLYMPIC_CHAMPIONS", "Proper_name", "no tokens"]
    });
    return result;
}

you will get these records with the first one:

Template: ATHLETES

Field	Value
@Sport_discipline	swimmer
@Name	Federica Pellegrini

Template: ATHLETES

Field	Value
@Sport_discipline	swimmer
@Name	Michael Phelps

Template: OLYMPIC_CHAMPIONS

Field	Value
@Proper_name	Federica Pellegrini
@Proper_name	Michael Phelps

and these records with the second one:

Template: ATHLETES

Field	Value
@Sport_discipline	swimmer
@Name	Federica Pellegrini

Template: ATHLETES

Field	Value
@Sport_discipline	swimmer
@Name	Michael Phelps

Template: OLYMPIC_CHAMPIONS

Field	Value
@Proper_name	Federica Pellegrini

Template: OLYMPIC_CHAMPIONS

Field	Value
@Proper_name	Michael Phelps

With clone new, a single OLYMPIC_CHAMPIONS record was created containing both field values, while with clone new multiple a separate record was created for each value.

The contents of the values array are the same as described for clone new.
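
The difference between the two actions can also be modeled in plain JavaScript. This sketch does not call jsonPlug. The function names and the simplified record shape (template, fields) are assumptions made only to show how the cloned values are grouped.

```javascript
// clone new: one record that holds every cloned value.
function cloneNewSketch(templateName, fieldName, values) {
    return [{
        template: templateName,
        fields: values.map(value => ({ field: fieldName, value: value }))
    }];
}

// clone new multiple: one record per cloned value.
function cloneNewMultipleSketch(templateName, fieldName, values) {
    return values.map(value => ({
        template: templateName,
        fields: [{ field: fieldName, value: value }]
    }));
}

const names = ["Federica Pellegrini", "Michael Phelps"];
const single = cloneNewSketch("OLYMPIC_CHAMPIONS", "Proper_name", names);
const multiple = cloneNewMultipleSketch("OLYMPIC_CHAMPIONS", "Proper_name", names);
console.log(single.length, single[0].fields.length); // 1 2
console.log(multiple.length, multiple[0].fields.length); // 2 1
```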

clone template/clone record

Use clone template or clone record (the two are equivalent) to duplicate a record under a different template name. Unlike the regular clone action, this action replicates all the fields in the matched record(s).

For example, consider these templates:

TEMPLATE(ATHLETES)
{
    @Name,
    @Sport_discipline
}

TEMPLATE(ATHLETES_CLONED)
{
    @Name,
    @Sport_discipline
}

If this rule:

SCOPE SENTENCE
{
    IDENTIFY(ATHLETES)
    {
        @Name[TYPE(NPH)]
        <>
        @Sport_discipline[LEMMA("swimmer")]
    }
}

is applied to the following input text:

Federica Pellegrini is one of the best swimmers of all time.

you will get this record:

Template: ATHLETES

Field	Value
@Sport_discipline	swimmer
@Name	Federica Pellegrini

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone record",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'ATHLETES')].fields[?(@.field == 'Name' && @.value == 'Federica Pellegrini')]",
        jsPathAction: "#this#",
        values: ["ATHLETES_CLONED"]
    });
    return result;
}

you will get these records:

Template: ATHLETES

Field	Value
@Sport_discipline	swimmer
@Name	Federica Pellegrini

Template: ATHLETES_CLONED

Field	Value
@Sport_discipline	swimmer
@Name	Federica Pellegrini

As you can see, a new ATHLETES_CLONED record was created, containing exactly the same fields as the original ATHLETES record.

The contents of the values array must be:

templateName

where:

templateName is the new record template name.

Warning

If templateName is identical to the original template, an exception will be raised.
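
One way to avoid the exception is to check the template names before building the options. The helper below is hypothetical and not part of jsonPlug. It only assembles the options object shown in the example above and fails early when the two names are equal.

```javascript
// Hypothetical helper: builds the options for clone record and rejects
// a target template name equal to the source template name.
function buildCloneRecordOptions(sourceTemplate, targetTemplate, jsPath) {
    if (sourceTemplate === targetTemplate) {
        throw new Error("Target template must differ from source template: " + sourceTemplate);
    }
    return {
        action: "clone record",
        jsPathConditionFlag: true,
        jsPath: jsPath,
        jsPathAction: "#this#",
        values: [targetTemplate]
    };
}

const cloneRecordOptions = buildCloneRecordOptions(
    "ATHLETES",
    "ATHLETES_CLONED",
    "$.match_info.rules.extraction[?(@.template == 'ATHLETES')].fields[?(@.field == 'Name' && @.value == 'Federica Pellegrini')]"
);
console.log(cloneRecordOptions.values); // [ 'ATHLETES_CLONED' ]
```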

clone value

Use clone value to copy an extracted value, optionally modifying it, into another predefined field.

For example, consider this template:

TEMPLATE(PERSONAL_DATA)
{
    @NAME,
    @AGE,
    @ADDRESS,
    @NICKNAME
}

If these rules:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @NAME[TYPE(NPH)]
    }

    IDENTIFY(PERSONAL_DATA)
    {
        @NICKNAME[TYPE(NPH)]
    }
}

are applied to this input text:

Hello Alan.

you will get these records:

Template: PERSONAL_DATA

Field	Value
@NAME	Alan

Template: PERSONAL_DATA

Field	Value
@NICKNAME	Alan

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone value",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NAME')]",
        jsPathAction: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NICKNAME')]",
        values: [true, /^(.+)$/, "The bard of Northampton"]
    });
    return result;
}

you will get these records:

Template: PERSONAL_DATA

Field	Value
@NAME	Alan

Template: PERSONAL_DATA

Field	Value
@NICKNAME	The bard of Northampton

As you can see, the value of the field NAME was turned into The bard of Northampton and copied to the field NICKNAME, while the field NAME kept its original value.

With this other code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone value",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NAME')]",
        jsPathAction: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NICKNAME')]",
        values: [true, "SCRIPT('toLower')"]
    });
    return result;
}

you will get these records:

Template: PERSONAL_DATA

Field	Value
@NAME	Alan

Template: PERSONAL_DATA

Field	Value
@NICKNAME	alan

As you can see, the value of the field NAME was converted to lowercase by the toLower built-in function and copied to the field NICKNAME.

To modify the cloned value, the contents of the values array must be:

replaceFlag, regularExpression, replacementString

or:

replaceFlag, scriptingFunctions

where:

replaceFlag is a boolean. The value true allows you to apply a regular expression or a script to modify the extracted value.
regularExpression is the regular expression that determines the parts of the value to change.
replacementString is the replacement string, where placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.
scriptingFunctions is one or more scripting functions, with any parameters they need, called with the same syntax as the SCRIPT attribute to further modify the output.

Note

Regular expressions cannot be combined with scripting functions in the values array. However, regular expressions can be used within a scripting function.

To clone the value without modifying it, leave the values array empty.
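
The two cases, an empty values array and the regular expression form, can be approximated with the plain JavaScript model below. The function name is hypothetical, the scripting form is deliberately not modeled, and the sketch assumes that the replacement behaves like String.prototype.replace.

```javascript
// Illustrative model of how the values array affects the cloned value.
// It does not call jsonPlug and does not cover the SCRIPT form.
function applyCloneValueSketch(value, values) {
    if (values.length === 0) {
        // Empty array: the value is cloned as it is.
        return value;
    }
    const replaceFlag = values[0];
    const regularExpression = values[1];
    const replacementString = values[2];
    if (replaceFlag === true && regularExpression instanceof RegExp) {
        return value.replace(regularExpression, replacementString);
    }
    throw new Error("Form not covered by this sketch");
}

console.log(applyCloneValueSketch("Alan", [])); // Alan
console.log(applyCloneValueSketch("Alan", [true, /^(.+)$/, "The bard of Northampton"])); // The bard of Northampton
```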

clone instances

Use clone instances to replace the normalized field values with the extracted textual values.

This method can be very useful when used in combination with tagging and/or transformation.

For example, consider this template and tag:

TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Age,
    @Address,
    @Job_type
}

TAGS
{
    @TAG1
}

If these rules:

SCOPE SENTENCE
{
    TAGGER()
    {
        @TAG1[LEMMA("developer", "software developer")]
    }

    IDENTIFY(PERSONAL_DATA)
    {
        @Job_type[TAG(TAG1)]|[TAG]
    }
}

are applied to this input text:

Marco is a developer and Jonathan and Mary are also software developers.

you will get this record:

Template: PERSONAL_DATA

Field	Value
@Job_type	TAG1

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone instances",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Job_type')].value",
        jsPathAction: "#this#",
        values: ["longest instance", true, /^((software) developer(s)?)$/gi, "$2 dev$3."]
    });
    return result;
}

you will get this record:

Template: PERSONAL_DATA

Field	Value
@Job_type	software devs.

As you can see, the longest textual value software developers was extracted and turned into software devs.

The contents of the values array must be:

instanceType, replaceFlag, regularExpression, replacementString

where:

instanceType is a flag that establishes which text value will be copied. It can be:
- all instances: clone all text instances separated by a pipe character (|).
- longest instance: clone the first longest instance of the text values.
- first instance: clone the first instance of the text values.
replaceFlag is a boolean. It can be:
- false: clone the text values as they are.
- true: apply a regular expression and the replacement string.
regularExpression is the regular expression that determines the parts of the value to change.
replacementString is the replacement string where placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.

Note

The last two parameters are required when replaceFlag is set to true.
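
The three instanceType options can be pictured with the plain JavaScript model below, applied to the two text instances of the example above. The function name is hypothetical and the sketch does not call jsonPlug. It assumes a non-empty list of instances and that ties for the longest instance are resolved in favor of the earlier one.

```javascript
// Illustrative model of the instanceType options.
function selectInstancesSketch(instances, instanceType) {
    switch (instanceType) {
        case "all instances":
            // All text instances separated by a pipe character.
            return instances.join("|");
        case "longest instance":
            // Strict comparison keeps the first instance among equally long ones.
            return instances.reduce(
                (best, current) => (current.length > best.length ? current : best)
            );
        case "first instance":
            return instances[0];
        default:
            throw new Error("Unknown instanceType: " + instanceType);
    }
}

const textInstances = ["developer", "software developers"];
const longest = selectInstancesSketch(textInstances, "longest instance");
// Same regular expression and replacement string as in the example above.
const replaced = longest.replace(/^((software) developer(s)?)$/gi, "$2 dev$3.");
console.log(replaced); // software devs.
```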