clone actions
Introduction
This page describes the following actions:
- clone
- clone new
- clone new multiple
- clone value
- clone template
- clone record
- clone instances
Note
When a clone is based on instances with different confidence scores, the confidence score of the resulting field is calculated with the same formula used for the confidence of instances. This applies to all actions except clone instances.
clone
Use clone to clone extraction fields and optionally modify the cloned values.
For example, consider this template:
TEMPLATE(PERSONAL_DATA)
{
@Name,
@Date_of_birth,
@Phone_number,
@Address,
@Age,
@Job,
@Type_of_job
}
If the following rule:
SCOPE SENTENCE
{
IDENTIFY(PERSONAL_DATA)
{
@Name[TYPE(NPH)]
<>
@Job[LEMMA("software engineer")]
}
}
is applied to this input text:
Jane works as a software engineer.
you will get this record:
Template: PERSONAL_DATA
Field | Value |
---|---|
@Name | Jane |
@Job | software engineer |
With this code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Job')]",
        jsPathAction: "#this#",
        // fieldName, cloneOption, scriptingFunctions
        values: ["Type_of_job", "no tokens", "SCRIPT('toUpper')"]
    });
    return result;
}
you will get this record:
Template: PERSONAL_DATA
Field | Value |
---|---|
@Type_of_job | SOFTWARE ENGINEER |
@Name | Jane |
@Job | software engineer |
As you can see, the Job field was cloned into the Type_of_job field, and its value was converted to uppercase by the toUpper built-in function.
As a different example, consider the same template, but these rules:
SCOPE SENTENCE
{
IDENTIFY(PERSONAL_DATA)
{
@Name[TYPE(NPH)]
<>
@Job[LEMMA("software engineer")]
}
IDENTIFY(PERSONAL_DATA)
{
@Name[TYPE(NPH)]
<>
@Job[KEYWORD("knowledge engineer")]
}
IDENTIFY(PERSONAL_DATA)
{
@Name[TYPE(NPH)]
}
}
If they are applied to this input text:
Jane works as a software engineer.
Mary is a knowledge engineer.
James is qualified for both jobs.
you will get these records:
Template: PERSONAL_DATA
Field | Value |
---|---|
@Name | Jane |
@Job | software engineer |
Template: PERSONAL_DATA
Field | Value |
---|---|
@Name | Mary |
@Job | knowledge engineer |
Template: PERSONAL_DATA
Field | Value |
---|---|
@Name | James |
With this code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Job')]",
        jsPathAction: [
            true, "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')]",
            false, ".fields[?(@.field == 'Job')]"
        ],
        // fieldName, cloneOption
        values: ["Type_of_job", "no tokens"]
    });
    return result;
}
you will get these records:
Template: PERSONAL_DATA
Field | Value |
---|---|
@Name | Jane |
@Job | software engineer |
Template: PERSONAL_DATA
Field | Value |
---|---|
@Name | Mary |
@Job | knowledge engineer |
Template: PERSONAL_DATA
Field | Value |
---|---|
@Type_of_job | software engineer |
@Type_of_job | knowledge engineer |
@Name | James |
As you can see, both values of the Job field were cloned into the Type_of_job field in the record that initially had the Name field only.
In a common use case, the new field is then processed to change its value. You can see an example of this in the description of the modify action.
The contents of the values array must be:
fieldName, cloneOption[, regularExpression, replacementString]
or:
fieldName, cloneOption[, replacementString]
or:
fieldName, cloneOption[, scriptingFunctions]
Note
The parts in square brackets are optional.
where:
- fieldName is either the new field name or an empty string, meaning that the cloned field will have the name of the source field.
- cloneOption can be:
  - no tokens: the new field will have no references to the rule(s) that determined the extraction or to the text that triggered the rule(s).
  - clone from source or an empty string: the new field has the same information as the source field, in terms of triggered rules and triggering text.
  - clone from sibling: for cases in which jsPathAction is used to select sibling nodes, the new field has the same information, in terms of triggered rules and triggering text, as the field selected by jsPathAction.
- regularExpression is the regular expression that determines the parts of the node value to change.
- replacementString is the replacement string, where placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.
- scriptingFunctions corresponds to the scripting functions, with any parameters, called with the same syntax as the SCRIPT attribute to further modify the output.
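The placeholder syntax can be tried outside the engine with plain JavaScript. The following is only a sketch: it assumes that a regularExpression and replacementString pair behaves like String.prototype.replace, which this page does not state explicitly, and the helper name previewReplacement is hypothetical.

```javascript
// Hypothetical helper, not part of jsonPlug: previews what a
// regularExpression/replacementString pair would produce, ASSUMING
// the same semantics as String.prototype.replace.
function previewReplacement(value, regularExpression, replacementString) {
    return value.replace(regularExpression, replacementString);
}

// $1 refers to the first capturing group, here the word before "engineer".
const shortened = previewReplacement("software engineer", /^(\w+) engineer$/, "$1 eng.");
console.log(shortened); // software eng.
```

Under the same assumption, a values array following the first documented form would look like ["Type_of_job", "no tokens", /^(\w+) engineer$/, "$1 eng."].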
Note
Regular expressions cannot be combined with scripting functions in the values array. However, regular expressions can be used within a scripting function.
Note
If you clone a field into a record that already contains a field with the same name and the same extracted value, the two are merged into a single field that inherits all their instances.
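The merge rule can be pictured with a small standalone sketch. The field shape used here ({ field, value, instances }) and the function name mergeIdenticalFields are hypothetical simplifications for illustration, not the engine's real data structure or API.

```javascript
// Hypothetical, simplified field shape: { field, value, instances }.
// Fields sharing both name and value collapse into one entry that
// collects the instances of all of them.
function mergeIdenticalFields(fields) {
    const merged = new Map();
    for (const f of fields) {
        const key = JSON.stringify([f.field, f.value]);
        if (merged.has(key)) {
            merged.get(key).instances.push(...f.instances);
        } else {
            merged.set(key, { field: f.field, value: f.value, instances: [...f.instances] });
        }
    }
    return [...merged.values()];
}

const mergedFields = mergeIdenticalFields([
    { field: "Type_of_job", value: "software engineer", instances: ["instance A"] },
    { field: "Type_of_job", value: "software engineer", instances: ["instance B"] },
    { field: "Name", value: "James", instances: ["instance C"] }
]);
console.log(mergedFields.length); // 2
console.log(mergedFields[0].instances.join(", ")); // instance A, instance B
```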
clone new
Use clone new to create a single record of a predefined template containing a clone of an existing field.
For example, consider these templates:
TEMPLATE(ATHLETES)
{
@Name,
@Sport_discipline
}
TEMPLATE(OLYMPIC_CHAMPIONS)
{
@Proper_name
}
If this rule:
SCOPE SENTENCE
{
IDENTIFY(ATHLETES)
{
@Name[TYPE(NPH)]
<>
@Sport_discipline[LEMMA("swimmer")]
}
}
is applied to the following input text:
Federica Pellegrini is one of the best swimmers of all time.
you will get this record:
Template: ATHLETES
Field | Value |
---|---|
@Sport_discipline | swimmer |
@Name | Federica Pellegrini |
With this code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone new",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'ATHLETES')].fields[?(@.field == 'Name')]",
        jsPathAction: "#this#",
        // templateName, fieldName, cloneOption, scriptingFunctions
        values: ["OLYMPIC_CHAMPIONS", "Proper_name", "no tokens", "SCRIPT('toLower')"]
    });
    return result;
}
you will get these records:
Template: ATHLETES
Field | Value |
---|---|
@Sport_discipline | swimmer |
@Name | Federica Pellegrini |
Template: OLYMPIC_CHAMPIONS
Field | Value |
---|---|
@Proper_name | federica pellegrini |
As you can see, a new OLYMPIC_CHAMPIONS record was created with the Proper_name field having the value of the field identified by the jsPath expression. The field value was converted to lowercase by the toLower built-in function.
The contents of the values array must be:
templateName, fieldName, cloneOption[, scriptingFunctions]
where:
- templateName is the new record template name.
- fieldName is either the new field name or an empty string, meaning that the cloned field will have the name of the source field.
- cloneOption can be:
  - no tokens: the new field will have no references to the rule(s) that determined the extraction or to the text that triggered the rule(s).
  - clone from source or an empty string: the new field has the same information as the source field, in terms of triggered rules and triggering text.
  - clone from sibling: for cases in which jsPathAction is used to select sibling nodes, the new field has the same information, in terms of triggered rules and triggering text, as the field selected by jsPathAction.
- scriptingFunctions corresponds to the scripting functions, with any parameters, called with the same syntax as the SCRIPT attribute to further modify the output.
clone new multiple
clone new multiple works like clone new, except that a separate record is created for each value of the field to be cloned.
For example, consider these templates:
TEMPLATE(ATHLETES)
{
@Name,
@Sport_discipline
}
TEMPLATE(OLYMPIC_CHAMPIONS)
{
@Proper_name
}
If this rule:
SCOPE SENTENCE
{
IDENTIFY(ATHLETES)
{
@Name[TYPE(NPH)]
<>
@Sport_discipline[LEMMA("swimmer")]
}
}
is applied to this text:
Federica Pellegrini is one of the best swimmers of all time.
Michael Phelps is also one of the best swimmers of all time.
you will get these records:
Template: ATHLETES
Field | Value |
---|---|
@Sport_discipline | swimmer |
@Name | Federica Pellegrini |
Template: ATHLETES
Field | Value |
---|---|
@Sport_discipline | swimmer |
@Name | Michael Phelps |
Compare the results of these two code snippets:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone new",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'ATHLETES')].fields[?(@.field == 'Name')]",
        jsPathAction: "#this#",
        // templateName, fieldName, cloneOption
        values: ["OLYMPIC_CHAMPIONS", "Proper_name", "no tokens"]
    });
    return result;
}
Or:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone new multiple",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'ATHLETES')].fields[?(@.field == 'Name')]",
        jsPathAction: "#this#",
        // templateName, fieldName, cloneOption
        values: ["OLYMPIC_CHAMPIONS", "Proper_name", "no tokens"]
    });
    return result;
}
you will get these records with the first one:
Template: ATHLETES
Field | Value |
---|---|
@Sport_discipline | swimmer |
@Name | Federica Pellegrini |
Template: ATHLETES
Field | Value |
---|---|
@Sport_discipline | swimmer |
@Name | Michael Phelps |
Template: OLYMPIC_CHAMPIONS
Field | Value |
---|---|
@Proper_name | Federica Pellegrini |
@Proper_name | Michael Phelps |
and these records with the second one:
Template: ATHLETES
Field | Value |
---|---|
@Sport_discipline | swimmer |
@Name | Federica Pellegrini |
Template: ATHLETES
Field | Value |
---|---|
@Sport_discipline | swimmer |
@Name | Michael Phelps |
Template: OLYMPIC_CHAMPIONS
Field | Value |
---|---|
@Proper_name | Federica Pellegrini |
Template: OLYMPIC_CHAMPIONS
Field | Value |
---|---|
@Proper_name | Michael Phelps |
While clone new created a single OLYMPIC_CHAMPIONS record holding both field values, clone new multiple created two records, one for each value.
The contents of the values array are the same as described in clone new.
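The difference between the two actions comes down to how the cloned values are grouped. The sketch below mimics that grouping in plain JavaScript; the record shape ({ template, fields }) and both function names are hypothetical and only serve the illustration.

```javascript
// Hypothetical, simplified record shape: { template, fields: [{ field, value }] }.

// Mimics "clone new": one record that collects every cloned value.
function cloneNewSketch(values, templateName, fieldName) {
    return [{
        template: templateName,
        fields: values.map(value => ({ field: fieldName, value }))
    }];
}

// Mimics "clone new multiple": one record per cloned value.
function cloneNewMultipleSketch(values, templateName, fieldName) {
    return values.map(value => ({
        template: templateName,
        fields: [{ field: fieldName, value }]
    }));
}

const names = ["Federica Pellegrini", "Michael Phelps"];
const single = cloneNewSketch(names, "OLYMPIC_CHAMPIONS", "Proper_name");
const multiple = cloneNewMultipleSketch(names, "OLYMPIC_CHAMPIONS", "Proper_name");
console.log(single.length, single[0].fields.length); // 1 2
console.log(multiple.length, multiple[0].fields.length); // 2 1
```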
clone template/clone record
Use clone template or clone record (the two are equivalent) to duplicate a record under a different template name. Unlike the regular clone, this action replicates all the fields in the matched record(s).
For example, consider these templates:
TEMPLATE(ATHLETES)
{
@Name,
@Sport_discipline
}
TEMPLATE(ATHLETES_CLONED)
{
@Name,
@Sport_discipline
}
If this rule:
SCOPE SENTENCE
{
IDENTIFY(ATHLETES)
{
@Name[TYPE(NPH)]
<>
@Sport_discipline[LEMMA("swimmer")]
}
}
is applied to the following input text:
Federica Pellegrini is one of the best swimmers of all time.
you will get this record:
Template: ATHLETES
Field | Value |
---|---|
@Sport_discipline | swimmer |
@Name | Federica Pellegrini |
With this code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone record",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'ATHLETES')].fields[?(@.field == 'Name' && @.value == 'Federica Pellegrini')]",
        jsPathAction: "#this#",
        // templateName
        values: ["ATHLETES_CLONED"]
    });
    return result;
}
you will get these records:
Template: ATHLETES
Field | Value |
---|---|
@Sport_discipline | swimmer |
@Name | Federica Pellegrini |
Template: ATHLETES_CLONED
Field | Value |
---|---|
@Sport_discipline | swimmer |
@Name | Federica Pellegrini |
As you can see, a new ATHLETES_CLONED record was created, containing exactly the same fields as the original ATHLETES record.
The contents of the values array must be:
templateName
where:
- templateName is the new record template name.
Warning
If templateName is identical to the original template name, an exception is raised.
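As an illustration of the behavior described in this section, the following standalone sketch copies every field of a record under a new template name and rejects an unchanged name. The record shape, the function name cloneRecordSketch and the error message are hypothetical; the real action may report the exception differently.

```javascript
// Hypothetical, simplified record shape: { template, fields: [{ field, value }] }.
function cloneRecordSketch(record, templateName) {
    // Mirrors the documented warning: the new name must differ from the original.
    if (templateName === record.template) {
        throw new Error("templateName must differ from the original template name");
    }
    // Copy each field object so the clone does not share state with the source.
    return { template: templateName, fields: record.fields.map(f => ({ ...f })) };
}

const athlete = {
    template: "ATHLETES",
    fields: [
        { field: "Sport_discipline", value: "swimmer" },
        { field: "Name", value: "Federica Pellegrini" }
    ]
};
const clonedAthlete = cloneRecordSketch(athlete, "ATHLETES_CLONED");
console.log(clonedAthlete.template, clonedAthlete.fields.length); // ATHLETES_CLONED 2
```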
clone value
Use clone value to copy an extracted value into another predefined field, optionally modifying it.
For example, consider this template:
TEMPLATE(PERSONAL_DATA)
{
@NAME,
@AGE,
@ADDRESS,
@NICKNAME
}
If these rules:
SCOPE SENTENCE
{
IDENTIFY(PERSONAL_DATA)
{
@NAME[TYPE(NPH)]
}
IDENTIFY(PERSONAL_DATA)
{
@NICKNAME[TYPE(NPH)]
}
}
are applied to this input text:
Hello Alan.
you will get these records:
Template: PERSONAL_DATA
Field | Value |
---|---|
@NAME | Alan |
Template: PERSONAL_DATA
Field | Value |
---|---|
@NICKNAME | Alan |
With this code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone value",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NAME')]",
        jsPathAction: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NICKNAME')]",
        // replaceFlag, regularExpression, replacementString
        values: [true, /^(.+)$/, "The bard of Northampton"]
    });
    return result;
}
you will get these records:
Template: PERSONAL_DATA
Field | Value |
---|---|
@NAME | Alan |
Template: PERSONAL_DATA
Field | Value |
---|---|
@NICKNAME | The bard of Northampton |
As you can see, the value of the NAME field was turned into The bard of Northampton and written to the NICKNAME field, while the NAME field itself is unchanged.
With this other code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone value",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NAME')]",
        jsPathAction: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'NICKNAME')]",
        // replaceFlag, scriptingFunctions
        values: [true, "SCRIPT('toLower')"]
    });
    return result;
}
you will get these records:
Template: PERSONAL_DATA
Field | Value |
---|---|
@NAME | Alan |
Template: PERSONAL_DATA
Field | Value |
---|---|
@NICKNAME | alan |
As you can see, the value of the NAME field was converted to lowercase by the toLower built-in function and written to the NICKNAME field, while the NAME field itself is unchanged.
When the cloned value is modified, the contents of the values array must be:
replaceFlag, regularExpression, replacementString
or:
replaceFlag, scriptingFunctions
where:
- replaceFlag is a boolean set to true, which allows you to apply a regular expression or a script to modify the extracted value.
- regularExpression is the regular expression that determines the parts of the value to change.
- replacementString is the replacement string, where placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.
- scriptingFunctions corresponds to the scripting functions, with any parameters, called with the same syntax as the SCRIPT attribute to further modify the output.
Note
Regular expressions cannot be combined with scripting functions in the values array. However, regular expressions can be used within a scripting function.
To clone the value without modifying it, leave the values array empty.
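The two documented shapes of the values array that do not involve scripting can be previewed with a standalone sketch. It assumes that the replacement behaves like String.prototype.replace, which this page does not confirm; the function name previewCloneValue is hypothetical, and scripting functions are deliberately left out.

```javascript
// Hypothetical preview of the value written to the target field.
// ASSUMPTION: replacement follows String.prototype.replace semantics.
function previewCloneValue(sourceValue, values) {
    // Empty array: the value is cloned as it is.
    if (values.length === 0) {
        return sourceValue;
    }
    const [replaceFlag, regularExpression, replacementString] = values;
    if (replaceFlag === true && regularExpression instanceof RegExp) {
        return sourceValue.replace(regularExpression, replacementString);
    }
    throw new Error("Scripting functions are not covered by this sketch");
}

const copied = previewCloneValue("Alan", []);
const replaced = previewCloneValue("Alan", [true, /^(.+)$/, "The bard of Northampton"]);
const decorated = previewCloneValue("Alan", [true, /^(.+)$/, "$1, the bard of Northampton"]);
console.log(copied);    // Alan
console.log(replaced);  // The bard of Northampton
console.log(decorated); // Alan, the bard of Northampton
```

The last call shows why the capturing group matters: with $1 in the replacement string, the original value can be kept as part of the new one instead of being overwritten.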
clone instances
Use clone instances to replace the normalized field values with the extracted textual values.
This action is especially useful in combination with tagging and/or transformation.
For example, consider this template and tag:
TEMPLATE(PERSONAL_DATA)
{
@Name,
@Age,
@Address,
@Job_type
}
TAGS
{
@TAG1
}
If these rules:
SCOPE SENTENCE
{
TAGGER()
{
@TAG1[LEMMA("developer", "software developer")]
}
IDENTIFY(PERSONAL_DATA)
{
@Job_type[TAG(TAG1)]|[TAG]
}
}
are applied to this input text:
Marco is a developer and Jonathan and Mary are also software developers.
you will get this record:
Template: PERSONAL_DATA
Field | Value |
---|---|
@Job_type | TAG1 |
With this code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "clone instances",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Job_type')].value",
        jsPathAction: "#this#",
        // instanceType, replaceFlag, regularExpression, replacementString
        values: ["longest instance", true, /^((software) developer(s)?)$/gi, "$2 dev$3."]
    });
    return result;
}
you will get this record:
Template: PERSONAL_DATA
Field | Value |
---|---|
@Job_type | software devs. |
As you can see, the longest textual value, "software developers", was extracted and turned into "software devs." (the trailing period comes from the replacement string).
The contents of the values array must be:
instanceType, replaceFlag, regularExpression, replacementString
where:
- instanceType is a flag that establishes which text value will be copied. It can be:
  - all instances: clone all text instances separated by a pipe character (|).
  - longest instance: clone the first longest instance of the text values.
  - first instance: clone the first instance of the text values.
- replaceFlag is a boolean. It can be:
  - false: clone the text values as they are.
  - true: apply a regular expression and the replacement string.
- regularExpression is the regular expression that determines the parts of the value to change.
- replacementString is the replacement string, where placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.
Note
The last two parameters must be provided if replaceFlag is set to true.
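The three instanceType options and the replacement step of the example can be traced with a standalone sketch. Several things here are assumptions: the function name pickInstances is hypothetical, the two instances are what the sample text is assumed to yield, ties for the longest instance are assumed to go to the earliest one, and the replacement is assumed to follow String.prototype.replace.

```javascript
// Hypothetical selection of the text value to clone.
function pickInstances(instances, instanceType) {
    switch (instanceType) {
        case "all instances":
            return instances.join("|");
        case "first instance":
            return instances[0];
        case "longest instance":
            // Strict ">" keeps the earlier instance when lengths are equal.
            return instances.reduce((best, cur) => (cur.length > best.length ? cur : best));
        default:
            throw new Error("Unknown instanceType: " + instanceType);
    }
}

// Instances ASSUMED for the sample text of this section.
const jobInstances = ["developer", "software developers"];

const allJoined = pickInstances(jobInstances, "all instances");
const longest = pickInstances(jobInstances, "longest instance");
// $2 is "software" and $3 is the optional plural "s".
const rewritten = longest.replace(/^((software) developer(s)?)$/gi, "$2 dev$3.");

console.log(allJoined); // developer|software developers
console.log(longest);   // software developers
console.log(rewritten); // software devs.
```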