clone actions

Introduction

This page describes the following actions:

  • clone
  • clone new
  • clone new multiple
  • clone value
  • clone instances

Note

When a clone is based on instances with different confidence scores, the confidence score of the cloned field is calculated with the same formula used for the confidence of instances. This applies to all actions except clone instances.

clone

Use clone to clone extraction fields and optionally modify the cloned values.

For example, consider this template:

TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Date_of_birth,
    @Phone_number,
    @Address,
    @Age,
    @Job,
    @Type_of_job
}

If the following rule:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Job[LEMMA("software engineer")]
    }
}

is applied to this input text:

Jane works as a software engineer.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "clone", true, "$..extraction[?(@.template == 'PERSONAL_DATA')]..fields[?(@.field == 'Job')]", "#this#", true, ["Type_of_job", "no tokens", "SCRIPT('toUpper')"]);
    return result;
}

you will get:

As you can see, the Job field was cloned into the Type_of_job field whose value has been converted into uppercase thanks to the toUpper built-in function.

As a different example, consider the same template, but these rules:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Job[LEMMA("software engineer")]
    }

    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Job[KEYWORD("knowledge engineer")]
    }

    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
    }
}

If they are applied to this input text:

Jane works as a software engineer. 
Mary is a knowledge engineer.
James is qualified for both jobs.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "clone", true,
        "$..extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Job')]",
        [true, "$..extraction[?(@.template == 'PERSONAL_DATA')]",
         false, ".fields[?(@.field == 'Job')]"],
        true, ["Type_of_job", "no tokens"]);
    return result;
}

you will get:

As you can see, both values of the Job field were cloned into the Type_of_job field in the record that initially had the Name field only.

In a common use case, the new field is then processed to change its value. You can see an example of this in the description of the modify action.

The contents of the values array must be:

fieldName, cloneOption[, regularExpression, replacementString]

or:

fieldName, cloneOption[, replacementString]

or:

fieldName, cloneOption[, scriptingFunctions]

Note

The parts in square brackets are optional.

where:

  • fieldName is either the new field name or an empty string, meaning that the cloned field will have the name of the source field.
  • cloneOption can be:
    • no tokens: the new field will have no references to the rule(s) that determined the extraction or to the text that triggered the rule(s).
    • clone from source or an empty string: the new field has the same information, in terms of triggered rules and triggering text, as the source field.
    • clone from sibling: for cases in which jspathaction is used to select sibling nodes, the new field has the same information, in terms of triggered rules and triggering text, as the field selected by jspathaction.
  • regularExpression is the regular expression that determines the parts of the node value to change.
  • replacementString is the replacement string, in which placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.
  • scriptingFunctions is one or more scripting functions, with any parameters they take, called with the same syntax as the SCRIPT attribute to further modify the output.

Note

Regular expressions cannot be used in combination with scripting functions.
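The layouts of the values array listed above can be sketched as a small builder. This is only an illustration: buildCloneValues is a hypothetical helper, not part of jsonPlug, and it encodes nothing more than the documented layouts and the rule that regular expressions and scripting functions are mutually exclusive.

```javascript
// Hypothetical helper (not a jsonPlug API): builds the values array for the
// "clone" action using only the layouts documented on this page.
function buildCloneValues({
    fieldName = "",
    cloneOption = "",
    regularExpression,
    replacementString,
    scriptingFunctions
} = {}) {
    const allowedOptions = ["", "no tokens", "clone from source", "clone from sibling"];
    if (!allowedOptions.includes(cloneOption)) {
        throw new Error("Unknown cloneOption: " + cloneOption);
    }
    const hasReplacement =
        regularExpression !== undefined || replacementString !== undefined;
    // Regular expressions and scripting functions cannot be combined.
    if (hasReplacement && scriptingFunctions !== undefined) {
        throw new Error("Regular expressions cannot be combined with scripting functions");
    }
    // A regular expression is only listed together with a replacement string.
    if (regularExpression !== undefined && replacementString === undefined) {
        throw new Error("regularExpression requires replacementString");
    }
    const values = [fieldName, cloneOption];
    if (regularExpression !== undefined) values.push(regularExpression);
    if (replacementString !== undefined) values.push(replacementString);
    if (scriptingFunctions !== undefined) values.push(scriptingFunctions);
    return values;
}
```

For instance, calling the helper with fieldName "Type_of_job", cloneOption "no tokens" and scriptingFunctions "SCRIPT('toUpper')" yields the same array passed to jsonPlug in the first example of this section.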

clone new

Use clone new to create a single record of a predefined template containing a clone of an existing field.

For example, consider these templates:

TEMPLATE(ATHLETES)
{
    @Name,
    @Sport_discipline
}

TEMPLATE(OLYMPIC_CHAMPIONS)
{
    @Proper_name
}

If this rule:

SCOPE SENTENCE
{
    IDENTIFY(ATHLETES)
    {
        @Name[TYPE(NPH)]
        <>
        @Sport_discipline[LEMMA("swimmer")]
    }
}

is applied to the following input text:

Federica Pellegrini is one of the best swimmers of all time.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "clone new", true, "$..extraction[?(@.template == 'ATHLETES')]..fields[?(@.field == 'Name')]", "#this#", true, ["OLYMPIC_CHAMPIONS", "Proper_name", "no tokens", "SCRIPT('toLower')"]);
    return result;
}

you will get:

As you can see, a new OLYMPIC_CHAMPIONS record was created with the Proper_name field having the value of the field identified by the jspath expression. The field value has been converted into lowercase thanks to the toLower built-in function.

The contents of the values array must be:

templateName, fieldName, cloneOption[, scriptingFunctions]

where:

  • templateName is the new record template name.
  • fieldName is either the new field name or an empty string, meaning that the cloned field will have the name of the source field.
  • cloneOption can be:
    • no tokens: the new field will have no references to the rule(s) that determined the extraction or to the text that triggered the rule(s).
    • clone from source or an empty string: the new field has the same information, in terms of triggered rules and triggering text, as the source field.
    • clone from sibling: for cases in which jspathaction is used to select sibling nodes, the new field has the same information, in terms of triggered rules and triggering text, as the field selected by jspathaction.
  • scriptingFunctions is one or more scripting functions, with any parameters they take, called with the same syntax as the SCRIPT attribute to further modify the output.

clone new multiple

clone new multiple works like clone new, except that a separate record is created for each value of the field to be cloned.

For example, consider these templates:

TEMPLATE(ATHLETES)
{
    @Name,
    @Sport_discipline
}

TEMPLATE(OLYMPIC_CHAMPIONS)
{
    @Proper_name
}

If this rule:

SCOPE SENTENCE
{
    IDENTIFY(ATHLETES)
    {
        @Name[TYPE(NPH)]
        <>
        @Sport_discipline[LEMMA("swimmer")]
    }
}

is applied to this text:

Federica Pellegrini is one of the best swimmers of all time.
Michael Phelps is also one of the best swimmers of all time.

you will get:

With each of these two code snippets:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "clone new", true, "$..extraction[?(@.template == 'ATHLETES')]..fields[?(@.field == 'Name')]", "#this#", true, ["OLYMPIC_CHAMPIONS", "Proper_name", "no tokens"]);
    return result;
}

Or:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "clone new multiple", true, "$..extraction[?(@.template == 'ATHLETES')]..fields[?(@.field == 'Name')]", "#this#", true, ["OLYMPIC_CHAMPIONS", "Proper_name", "no tokens"]);
    return result;
}

you will get:

[Output panels: clone new, clone new multiple]

With clone new, a single OLYMPIC_CHAMPIONS record was created containing both field values. With clone new multiple, two records were created, one for each value.

The contents of the values array are the same as described for clone new.
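The difference between the two actions can be pictured with a toy model. The record shape used here (template, fields, values) is a simplification chosen for illustration and is not the actual structure of the jsonPlug result; both function names are hypothetical.

```javascript
// Toy model only: the record shape is a simplification, not jsonPlug output.

// "clone new": one new record holding every selected value.
function cloneNewSketch(sourceValues, templateName, fieldName) {
    return [{
        template: templateName,
        fields: [{ field: fieldName, values: [...sourceValues] }]
    }];
}

// "clone new multiple": one new record per selected value.
function cloneNewMultipleSketch(sourceValues, templateName, fieldName) {
    return sourceValues.map((value) => ({
        template: templateName,
        fields: [{ field: fieldName, values: [value] }]
    }));
}
```

Given the two names from the example text, the first function returns one OLYMPIC_CHAMPIONS record with two values, while the second returns two records with one value each.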

clone value

Use clone value to clone and/or modify an extracted value into another predefined field.

For example, consider this template:

TEMPLATE(PERSONAL_DATA)
{
    @NAME,
    @AGE,
    @ADDRESS,
    @NICKNAME
}

If these rules:

SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @NAME[TYPE(NPH)]
    }

    IDENTIFY(PERSONAL_DATA)
    {
        @NICKNAME[TYPE(NPH)]
    }
}

are applied to this input text:

Hello Alan.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "clone value", true,
        "$..extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NAME')]",
        "$..extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NICKNAME')]",
        true, [true, /^(.+)$/, "The bard of Northampton"]);
    return result;
}

you will get:

As you can see, the value of the field NAME has been turned into The bard of Northampton and moved to the field NICKNAME.

With this other code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "clone value", true,
        "$..extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NAME')]",
        "$..extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NICKNAME')]",
        true, [true, "SCRIPT('toLower')"]);
    return result;
}

you will get:

As you can see, the value of the field NAME has been converted to lowercase by the toLower built-in function and moved to the field NICKNAME.

The contents of the values array in case of modification of the cloned value must be:

replaceFlag, regularExpression, replacementString

or:

replaceFlag, scriptingFunctions

where:

  • replaceFlag is a boolean. Set it to true to apply a regular expression or a script that modifies the extracted value.
  • regularExpression is the regular expression that determines the parts of the value to change.
  • replacementString is the replacement string, in which placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.
  • scriptingFunctions is one or more scripting functions, with any parameters they take, called with the same syntax as the SCRIPT attribute to further modify the output.

Note

Regular expressions cannot be used in combination with scripting functions.

To clone the value without modifying it, leave the values array empty.
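The effect of the values array on the cloned value can be approximated in plain JavaScript. This sketch assumes that the regular expression and the $1-style placeholders behave like String.prototype.replace, and it handles only the toLower and toUpper built-in functions mentioned on this page; cloneValueSketch is a hypothetical name, not a jsonPlug function.

```javascript
// Hypothetical sketch (not a jsonPlug API): returns the value that would be
// written to the target field for a given values array.
function cloneValueSketch(sourceValue, values) {
    // Empty array: the value is cloned unchanged.
    if (values.length === 0) return sourceValue;

    const [replaceFlag, second, third] = values;
    if (replaceFlag !== true) return sourceValue;

    // Layout: replaceFlag, regularExpression, replacementString
    if (second instanceof RegExp) return sourceValue.replace(second, third);

    // Layout: replaceFlag, scriptingFunctions (only two functions sketched)
    const script = /^SCRIPT\('(toLower|toUpper)'\)$/.exec(second);
    if (script === null) {
        throw new Error("Unsupported values array in this sketch");
    }
    return script[1] === "toLower"
        ? sourceValue.toLowerCase()
        : sourceValue.toUpperCase();
}
```

With the source value Alan, the two values arrays from the examples above produce The bard of Northampton and alan respectively, and a replacement string such as "$1 the bard" would produce Alan the bard under the same assumption.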

clone instances

Use clone instances to replace the normalized field values with the extracted textual values.

This method can be very useful when used in combination with tagging and/or transformation.

For example, consider this template and tag:

TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Age,
    @Address,
    @Job_type
}

TAGS
{
    @TAG1
}

If these rules:

SCOPE SENTENCE
{
    TAGGER()
    {
        @TAG1[LEMMA("developer", "software developer")]
    }

    IDENTIFY(PERSONAL_DATA)
    {
        @Job_type[TAG(TAG1)]|[TAG]
    }
}

are applied to this input text:

Marco is a developer and Jonathan and Mary are also software developers.

you will get:

With this code:

function onFinalize(result) {
    jsonPlug.jsonPlug(result, "clone instances", true, "$..extraction[?(@.template == 'PERSONAL_DATA')]..fields[?(@.field == 'Job_type')].value", "#this#", true, ["longest instance", true, /^((software) developer(s)?)$/gi, "$2 dev$3."]);
    return result;
}

you will get:

As you can see, the longest textual value software developers was extracted and turned into software devs.

The contents of the values array must be:

instanceType, replaceFlag, regularExpression, replacementString

where:

  • instanceType is a flag that establishes which text value will be copied. It can be:
    • all instances: clone all text instances separated by a pipe character (|).
    • longest instance: clone the first longest instance of the text values.
    • first instance: clone the first instance of the text values.
  • replaceFlag is a boolean. It can be:
    • false: clone the text values as they are.
    • true: apply a regular expression and the replacement string.
  • regularExpression is the regular expression that determines the parts of the value to change.
  • replacementString is the replacement string where placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.

Note

The last two parameters are required when replaceFlag is set to true.
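The selection and replacement steps described above can be sketched in plain JavaScript. The sketch assumes a non-empty list of text instances in document order and assumes that the replacement behaves like String.prototype.replace; selectInstances and cloneInstancesSketch are hypothetical names, not jsonPlug functions.

```javascript
// Hypothetical sketch (not a jsonPlug API): picks the text to clone
// according to instanceType. Assumes a non-empty instances array.
function selectInstances(instances, instanceType) {
    switch (instanceType) {
        case "all instances":
            return instances.join("|");
        case "first instance":
            return instances[0];
        case "longest instance":
            // Strict comparison keeps the first of several equally long instances.
            return instances.reduce((best, current) =>
                current.length > best.length ? current : best);
        default:
            throw new Error("Unknown instanceType: " + instanceType);
    }
}

// Applies the optional regular expression after the selection step.
function cloneInstancesSketch(instances, values) {
    const [instanceType, replaceFlag, regularExpression, replacementString] = values;
    const text = selectInstances(instances, instanceType);
    if (!replaceFlag) return text;
    if (!(regularExpression instanceof RegExp) || typeof replacementString !== "string") {
        throw new Error("regularExpression and replacementString are required when replaceFlag is true");
    }
    return text.replace(regularExpression, replacementString);
}
```

With the instances developer and software developers and the values array from the example above, the sketch selects software developers and returns software devs. (the trailing period comes from the replacement string).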