delete/modify actions
Introduction
This page describes the following actions:
- delete
- delete template
- delete record
- modify
- modify regex
delete
Use delete to delete extraction fields, a record or a category.
For example, consider this template:
TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Date_of_birth,
    @Phone_number,
    @Address,
    @Job,
    @Type_of_job
}
If this rule:
SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Phone_number[TYPE(PHO)]
    }
}
is applied to the following input text:
Stephen King's number is 0000000000.
you will get this record:
Template: PERSONAL_DATA
Field | Value |
---|---|
@Phone_number | 0000000000 |
@Name | Stephen King |
With this code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "delete",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Name')]",
        jsPathAction: "#this#"
    });
    return result;
}
you will get this record:
Template: PERSONAL_DATA
Field | Value |
---|---|
@Phone_number | 0000000000 |
As you can see, the Name field belonging to the PERSONAL_DATA record was deleted.
The values parameter is ignored by this action and can be left empty.
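To make the effect of the path concrete, here is a minimal sketch in plain JavaScript that mimics the deletion on a simplified result object. It does not call jsonPlug: the sampleResult object and the deleteField helper are hypothetical, and the object shape is only inferred from the jsPath used in the example.

```javascript
// Hypothetical, simplified result object: only the properties addressed by
// the jsPath in the example are present. The real object may hold more data.
const sampleResult = {
  match_info: {
    rules: {
      extraction: [
        {
          template: "PERSONAL_DATA",
          fields: [
            { field: "Phone_number", value: "0000000000" },
            { field: "Name", value: "Stephen King" }
          ]
        }
      ]
    }
  }
};

// Hypothetical helper (not part of jsonPlug): removes every field called
// fieldName from the records whose template is templateName.
function deleteField(result, templateName, fieldName) {
  for (const record of result.match_info.rules.extraction) {
    if (record.template === templateName) {
      record.fields = record.fields.filter((f) => f.field !== fieldName);
    }
  }
  return result;
}

deleteField(sampleResult, "PERSONAL_DATA", "Name");
console.log(sampleResult.match_info.rules.extraction[0].fields);
```

After the call, only the Phone_number field is left in the record, which matches the table shown for the delete action.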
delete template/delete record
Use delete template or delete record to delete whole records.
For example, consider this template:
TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Date_of_birth,
    @Phone_number,
    @Address,
    @Job,
    @Type_of_job
}
If the rule used for delete is applied to the same input text as above, with this code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "delete template",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Name')]",
        jsPathAction: "#this#"
    });
    return result;
}
or with this code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "delete record",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Name')]",
        jsPathAction: "#this#"
    });
    return result;
}
all the PERSONAL_DATA records containing a Name field are deleted.
The values parameter is ignored by this action and can be left empty.
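The difference from delete is that the whole record goes away, not just the matched field. A minimal sketch of that behavior in plain JavaScript follows. The deleteRecordsWithField helper is hypothetical and does not use jsonPlug, and the second record in the sample data is made up for illustration.

```javascript
// Hypothetical helper (not part of jsonPlug): drops every record of the given
// template that contains at least one field called fieldName.
function deleteRecordsWithField(extraction, templateName, fieldName) {
  return extraction.filter(
    (record) =>
      !(
        record.template === templateName &&
        record.fields.some((f) => f.field === fieldName)
      )
  );
}

// Simplified, made-up extraction array: the first record has a Name field,
// the second one does not.
const sampleExtraction = [
  {
    template: "PERSONAL_DATA",
    fields: [
      { field: "Name", value: "Stephen King" },
      { field: "Phone_number", value: "0000000000" }
    ]
  },
  {
    template: "PERSONAL_DATA",
    fields: [{ field: "Phone_number", value: "1111111111" }]
  }
];

const keptRecords = deleteRecordsWithField(sampleExtraction, "PERSONAL_DATA", "Name");
console.log(keptRecords.length);
```

Only the record without a Name field survives, since the path selects records through their Name field.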
modify
Use modify to change:
- Category names
- Category labels
- Category winner keys
- Record template names
- Field names
- Field values
- Field confidence scores
Note
You can modify these values by providing a static value or by invoking a custom function using the SCRIPT attribute.
For example, consider these templates differing only in the name:
TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Date_of_birth,
    @Geographical_location,
    @Main_works
}
TEMPLATE(COMIC_WRITERS)
{
    @Name,
    @Date_of_birth,
    @Geographical_location,
    @Main_works
}
If the following rule:
SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Date_of_birth[TYPE(DAT)]
        <>
        @Geographical_location[SYNCON(100192240)] //@SYN: #100192240# [Northampton]
    }
}
is applied to this input text:
Mr. Alan Moore was born on the 18th of November 1953 in Northampton.
you will get this record:
Template: PERSONAL_DATA
Field | Value |
---|---|
@Name | Alan Moore |
@Geographical_location | Northampton |
@Date_of_birth | Nov-18-1953 |
With this code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "modify",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].template",
        jsPathAction: "#this#",
        values: ["COMIC_WRITERS"]
    });
    return result;
}
or with this one:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "modify",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].template",
        jsPathAction: "#this#",
        values: "COMIC_WRITERS"
    });
    return result;
}
you will get this record:
Template: COMIC_WRITERS
Field | Value |
---|---|
@Name | Alan Moore |
@Geographical_location | Northampton |
@Date_of_birth | Nov-18-1953 |
Note
If a modified field ends up with the same name and the same extracted value as a sibling field, the two are merged into a single field that inherits all their instances.
As you can see, the record template name has changed from PERSONAL_DATA to COMIC_WRITERS.
When a value is replaced with a new one, the values parameter must contain the new value for the selected nodes. It can be an array or a string.
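The two accepted forms of values can be pictured with a small sketch in plain JavaScript. The firstValue and renameTemplate helpers are hypothetical and only imitate the renaming on a simplified array of records. They are not how jsonPlug is implemented.

```javascript
// Hypothetical helper: accepts the new value either as a string or as a
// one-item array, mirroring the two forms of values used in the examples.
function firstValue(values) {
  return Array.isArray(values) ? values[0] : values;
}

// Hypothetical helper (not part of jsonPlug): renames the template of every
// record currently called oldName.
function renameTemplate(extraction, oldName, values) {
  const newName = firstValue(values);
  for (const record of extraction) {
    if (record.template === oldName) {
      record.template = newName;
    }
  }
  return extraction;
}

const renamedWithArray = renameTemplate(
  [{ template: "PERSONAL_DATA", fields: [] }],
  "PERSONAL_DATA",
  ["COMIC_WRITERS"]
);
const renamedWithString = renameTemplate(
  [{ template: "PERSONAL_DATA", fields: [] }],
  "PERSONAL_DATA",
  "COMIC_WRITERS"
);
console.log(renamedWithArray[0].template, renamedWithString[0].template);
```

Both calls end with a record whose template is COMIC_WRITERS.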
The same output can also be obtained with the following code, which uses the string multiple as jsPathConditionFlag:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "modify",
        jsPathConditionFlag: "multiple",
        jsPath: [
            false, "$.match_info.rules.extraction[?(@.template == 'COMIC_WRITERS')]",
            true, "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')]",
            true, ".fields[?(@.field == 'Date_of_birth' && @.value == 'Nov-18-1953')]",
            true, "^.template"
        ],
        jsPathAction: "#this#",
        values: "COMIC_WRITERS",
        skipNameValidation: true
    });
    return result;
}
With this example code, the record template name will be modified only if:
- There is not an initial record named COMIC_WRITERS.
- There is an initial record named PERSONAL_DATA.
- There is a Date_of_birth field whose value, in its base form, is Nov-18-1953.
If these conditions are satisfied, ^.template goes back to the template level and modifies its name.
Note
- With absolute paths, the last true defines the final match of the paths array.
- However, each true or false validation based on a relative path acts as a filter on the previous absolute path. Multiple relative validations can be used, which makes it possible to quickly apply complex validations to the extraction results.
- Should any true validation return no valid matches, the whole validation fails and no change is applied to the result object.
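The chain of validations in the multiple example can be read as the following sketch in plain JavaScript. It is only an interpretation of the conditions listed above, written against a simplified array of records. The renameIfAllConditionsHold helper is hypothetical and does not reproduce jsonPlug internals.

```javascript
// Hypothetical helper (not part of jsonPlug): applies the rename only when
// every check succeeds; otherwise it leaves the records untouched.
function renameIfAllConditionsHold(extraction) {
  // false + absolute path: there must be NO COMIC_WRITERS record.
  if (extraction.some((r) => r.template === "COMIC_WRITERS")) {
    return false;
  }

  // true + absolute path: select the PERSONAL_DATA records.
  const candidates = extraction.filter((r) => r.template === "PERSONAL_DATA");

  // true + relative path: filter the previous selection by Date_of_birth.
  const matches = candidates.filter((r) =>
    r.fields.some(
      (f) => f.field === "Date_of_birth" && f.value === "Nov-18-1953"
    )
  );
  if (matches.length === 0) {
    return false; // a failed true validation means no change at all
  }

  // "^.template": step back to the record level and change its template name.
  for (const record of matches) {
    record.template = "COMIC_WRITERS";
  }
  return true;
}

const mooreExtraction = [
  {
    template: "PERSONAL_DATA",
    fields: [
      { field: "Name", value: "Alan Moore" },
      { field: "Date_of_birth", value: "Nov-18-1953" }
    ]
  }
];
const wasRenamed = renameIfAllConditionsHold(mooreExtraction);
console.log(wasRenamed, mooreExtraction[0].template);
```

Calling the helper a second time on the same array changes nothing, because a COMIC_WRITERS record now exists and the first check fails.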
You can also use this action to alter the confidence score.
The extracted fields in the example above all have a field confidence score of 1.0.
With this code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "modify",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[0:].confidence",
        jsPathAction: "#this#",
        values: "0.40"
    });
    return result;
}
you will get field confidence scores of 0.40.
The values parameter must contain one item: the new score for the selected fields. It can be an array, a string or a number.
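As a rough illustration of the accepted input forms for the score, the sketch below normalizes them to a number and applies the result to every field of a record. The toScore and setConfidence helpers are hypothetical, and whether jsonPlug performs the same conversion internally is an assumption.

```javascript
// Hypothetical helper: takes the new score as an array, a string or a number
// and returns it as a number. The conversion rule is an assumption.
function toScore(values) {
  const raw = Array.isArray(values) ? values[0] : values;
  const score = Number(raw);
  if (Number.isNaN(score)) {
    throw new TypeError("values must hold a numeric score");
  }
  return score;
}

// Hypothetical helper (not part of jsonPlug): sets the same confidence on
// every field of the record, like the fields[0:].confidence path does.
function setConfidence(record, values) {
  const score = toScore(values);
  for (const f of record.fields) {
    f.confidence = score;
  }
  return record;
}

const scoredRecord = setConfidence(
  {
    template: "PERSONAL_DATA",
    fields: [
      { field: "Name", value: "Alan Moore", confidence: 1.0 },
      { field: "Geographical_location", value: "Northampton", confidence: 1.0 }
    ]
  },
  "0.40"
);
console.log(scoredRecord.fields.map((f) => f.confidence));
```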
With this other code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "modify",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Geographical_location')].value",
        jsPathAction: "#this#",
        values: "SCRIPT('toUpper')"
    });
    return result;
}
or with this one:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "modify",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].fields[?(@.field == 'Geographical_location')].value",
        jsPathAction: "#this#",
        values: ["SCRIPT('toUpper')"]
    });
    return result;
}
you will get this record:
Template: PERSONAL_DATA
Field | Value |
---|---|
@Name | Alan Moore |
@Geographical_location | NORTHAMPTON |
@Date_of_birth | Nov-18-1953 |
As you can see, the Geographical_location field value has been turned into uppercase by the toUpper built-in function.
You can use both the built-in functions and any custom functions developed for your project.
When scripting functions are used to transform the initial output value instead of supplying a new one, the values parameter must contain the scripting functions, each with an optional parameter, written with the same syntax as the SCRIPT attribute.
Note
Regular expressions are not allowed in the values array when using the modify action. However, you can use regular expressions inside the invoked function(s).
In jsonPlug, the SCRIPT() attribute behaves differently than inside rules. Inside rules, SCRIPT() automatically receives the index of the matched token as its first parameter. With jsonPlug, the first parameter is instead a string. This string represents an object containing information equivalent to the paths output of the queryJsonPath method. This allows scripting functions to be integrated with jsonPlug for advanced data manipulation and extraction.
modify regex
This action works like modify, with the only difference that a regular expression determines which parts of the value are replaced.
For example, consider the following templates:
TEMPLATE(PERSONAL_DATA)
{
    @Name,
    @Date_of_birth,
    @Geographical_location,
    @Main_works
}
TEMPLATE(PERSONAL_DATA_TEST)
{
    @Name,
    @Date_of_birth,
    @Geographical_location,
    @Main_works
}
If this rule:
SCOPE SENTENCE
{
    IDENTIFY(PERSONAL_DATA)
    {
        @Name[TYPE(NPH)]
        <>
        @Date_of_birth[TYPE(DAT)]
        <>
        @Geographical_location[SYNCON(100192240)] //@SYN: #100192240# [Northampton]
    }
}
is applied to this input text:
Mr. Alan Moore was born on the 18th of November 1953 in Northampton.
you will get this record:
Template: PERSONAL_DATA
Field | Value |
---|---|
@Name | Alan Moore |
@Geographical_location | Northampton |
@Date_of_birth | Nov-18-1953 |
This code:
function onFinalize(result) {
    jsonPlug.jsonPlug(result, {
        action: "modify regex",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'PERSONAL_DATA')].template",
        jsPathAction: "#this#",
        values: [/^(.+)$/, "$1_TEST"]
    });
    return result;
}
will produce this change:
Template: PERSONAL_DATA_TEST
Field | Value |
---|---|
@Name | Alan Moore |
@Geographical_location | Northampton |
@Date_of_birth | Nov-18-1953 |
Note
If a modified field ends up with the same name and the same extracted value as a sibling field, the two are merged into a single field that inherits all their instances.
As you can see, the record name has changed. The new name is the text matched by the first capturing group of the regular expression followed by _TEST.
The contents of the values array must be:
regularExpression, replacementString
where:
- regularExpression is the regular expression that determines the parts of the node value to change.
- replacementString is the replacement string, where placeholders like $1, $2, etc. can be used to refer to the capturing groups of the regular expression.
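The placeholder mechanism can be tried in isolation with the standard JavaScript String.prototype.replace method. The sketch below assumes that modify regex follows the same placeholder semantics. It does not call jsonPlug, and the second pattern is a made-up illustration of several capturing groups.

```javascript
// Same pattern and replacement string as in the example: the whole value is
// captured by group 1 and _TEST is appended to it.
const pattern = /^(.+)$/;
const replacement = "$1_TEST";
const renamedTemplate = "PERSONAL_DATA".replace(pattern, replacement);
console.log(renamedTemplate); // PERSONAL_DATA_TEST

// Made-up second example with three capturing groups: the placeholders
// $1, $2 and $3 can appear in any order in the replacement string.
const reorderedDate = "Nov-18-1953".replace(/^(\w+)-(\d+)-(\d+)$/, "$2 $1 $3");
console.log(reorderedDate); // 18 Nov 1953
```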