extraction property
result.match_info.rules.extraction is an array containing the results of extraction.
Each array item represents an extraction record and has the following properties:
Property | Description |
---|---|
template | Extraction's template |
fields | Extraction's fields |
fields is an array. Each item represents a template's field and has the following properties:
Property | Description |
---|---|
field | Field name |
value | Field value |
instance | Field instances |
confidence | Field confidence score |
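The record and field layout described so far can be walked with two nested loops. The following is a minimal sketch, not part of the product API: listFieldValues is a hypothetical helper name, and sampleResult is a hand-written stand-in for result that contains only the properties documented above.

```javascript
// Hypothetical helper: flattens the extraction records into one row per field.
function listFieldValues(result) {
    var rows = [];
    var records = result.match_info.rules.extraction;
    for (var i = 0; i < records.length; i++) {
        var fields = records[i].fields;
        for (var j = 0; j < fields.length; j++) {
            rows.push({
                template: records[i].template,
                field: fields[j].field,
                value: fields[j].value
            });
        }
    }
    return rows;
}

// Hand-written sample shaped like the documented structure (illustrative only).
var sampleResult = {
    match_info: { rules: { extraction: [
        { template: "BRANDS", fields: [{ field: "BRAND", value: "BMW", instance: [] }] }
    ] } }
};

var rows = listFieldValues(sampleResult);
```

Each row keeps the template name next to the field, which is convenient when the same field name occurs in more than one template.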
instance is an array. Each item represents an instance of the field and has the following properties:
Property | Description |
---|---|
group_by | When two instances of different fields of the same record have the same value for this property, they must be considered as an aggregate |
text | Field instance text |
pos | Zero-based position of the field instance text |
len | Length of the field instance text |
snt | Sentence number |
snt_begin | Sentence initial position in the text |
snt_end | Sentence final position in the text |
syncon | Syncon ID |
ancestor | Ancestor ID |
rule_details | Rule details |
confidence | Instance confidence score |
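The group_by rule can be sketched as follows: within one record, the instances of all fields are bucketed by their group_by value, so that instances sharing a value end up in the same aggregate. This is only an illustration under stated assumptions: groupInstances is a hypothetical helper, and the sample record (its template name, fields and values) is invented.

```javascript
// Hypothetical helper: buckets the instances of one extraction record by group_by.
function groupInstances(record) {
    var groups = {};
    for (var j = 0; j < record.fields.length; j++) {
        var field = record.fields[j];
        var instances = field.instance || [];
        for (var k = 0; k < instances.length; k++) {
            var key = String(instances[k].group_by);
            if (!Object.prototype.hasOwnProperty.call(groups, key)) {
                groups[key] = [];
            }
            groups[key].push({ field: field.field, text: instances[k].text });
        }
    }
    return groups;
}

// Invented sample: NAME and ROLE share group_by 0; a second ROLE instance has group_by 1.
var record = {
    template: "PERSON_ROLE",
    fields: [
        { field: "NAME", value: "Oliver Zipse", instance: [{ group_by: 0, text: "Oliver Zipse" }] },
        { field: "ROLE", value: "CEO", instance: [
            { group_by: 0, text: "CEO" },
            { group_by: 1, text: "chairman" }
        ] }
    ]
};

var groups = groupInstances(record);
```

In this sample, groups["0"] holds the NAME and ROLE instances that belong together, while groups["1"] holds the remaining ROLE instance on its own.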
rule_details is an array. Its items have the following properties:
Property | Description |
---|---|
id | Rule identification number, assigned when the project is built. It is the index of the compiled rule in the array where the rules are stored, so it changes after every build. |
label | Rule label, if any |
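Since id changes after every build, a script that needs to tell rules apart is probably better off relying on label where one is defined. The sketch below collects the distinct non-empty labels found across all instances; collectRuleLabels is a hypothetical helper, and the label brand_rule in the sample is invented.

```javascript
// Hypothetical helper: returns the distinct non-empty rule labels of all instances.
function collectRuleLabels(result) {
    var labels = [];
    var records = result.match_info.rules.extraction;
    for (var i = 0; i < records.length; i++) {
        var fields = records[i].fields;
        for (var j = 0; j < fields.length; j++) {
            var instances = fields[j].instance || [];
            for (var k = 0; k < instances.length; k++) {
                var details = instances[k].rule_details || [];
                for (var d = 0; d < details.length; d++) {
                    var label = details[d].label;
                    // Skip empty labels and labels already collected.
                    if (label && labels.indexOf(label) === -1) {
                        labels.push(label);
                    }
                }
            }
        }
    }
    return labels;
}

// Invented sample: two instances triggered by the same labeled rule, one by an unlabeled rule.
var labeledResult = {
    match_info: { rules: { extraction: [
        { template: "BRANDS", fields: [
            { field: "BRAND", value: "BMW", instance: [
                { text: "BMW", rule_details: [{ id: 1, label: "brand_rule" }] },
                { text: "BMW", rule_details: [{ id: 1, label: "brand_rule" }, { id: 2, label: "" }] }
            ] }
        ] }
    ] } }
};

var ruleLabels = collectRuleLabels(labeledResult);
```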
For example, consider the following text:
BMW released Tuesday the details of an electric concept car, with production of the vehicle expected to start in 2021.
In an interview with CNBC Tuesday, CEO Oliver Zipse described the BMW Concept i4 vehicle as bringing "electromobility to the heart of the BMW brand".
The firm is one of several major manufacturers developing an electric vehicle offering to challenge electric car makers like Tesla.
and the rule:
SCOPE SENTENCE
{
    IDENTIFY(BRANDS)
    {
        @BRAND[ANCESTOR(376882)] //@SYN: #376882# [tag_all_brands]
    }
}
the extraction property has the following JSON serialization:

    "extraction": [
        {
            "template": "BRANDS",
            "fields": [
                {
                    "field": "BRAND",
                    "value": "BMW",
                    "instance": [
                        {
                            "group_by": 0,
                            "text": "BMW",
                            "rule_details": [
                                {
                                    "id": 1,
                                    "label": ""
                                }
                            ],
                            "pos": 0,
                            "len": 3,
                            "snt": 1,
                            "snt_begin": 0,
                            "snt_end": 117,
                            "syncon": 1039566,
                            "ancestor": -1
                        }
                    ]
                }
            ]
        },
        {
            "template": "BRANDS",
            "fields": [
                {
                    "field": "BRAND",
                    "value": "Tesla (Veicoli)",
                    "instance": [
                        {
                            "group_by": 1000000,
                            "text": "Tesla",
                            "rule_details": [
                                {
                                    "id": 1,
                                    "label": ""
                                }
                            ],
                            "pos": 394,
                            "len": 5,
                            "snt": 3,
                            "snt_begin": 269,
                            "snt_end": 399,
                            "syncon": 1001728,
                            "ancestor": -1
                        }
                    ]
                }
            ]
        }
    ]
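The offsets in this serialization can be checked against the sample text. The sketch below assumes that pos, len, snt_begin and snt_end are character offsets into the analyzed text and that snt_end points at the last character of the sentence, which is consistent with the values shown; offsetInSentence and sliceInstance are hypothetical helper names. Working relative to the sentence avoids any assumption about how the three sentences are separated in the input.

```javascript
// Hypothetical helper: position of the instance relative to the start of its sentence.
function offsetInSentence(instance) {
    return instance.pos - instance.snt_begin;
}

// Hypothetical helper: cuts len characters out of a string, starting at the given offset.
function sliceInstance(text, start, instance) {
    return text.slice(start, start + instance.len);
}

// First and third sentence of the sample text.
var firstSentence = "BMW released Tuesday the details of an electric concept car, with production of the vehicle expected to start in 2021.";
var thirdSentence = "The firm is one of several major manufacturers developing an electric vehicle offering to challenge electric car makers like Tesla.";

// Offsets copied from the serialization.
var bmw = { text: "BMW", pos: 0, len: 3, snt_begin: 0, snt_end: 117 };
var tesla = { text: "Tesla", pos: 394, len: 5, snt_begin: 269, snt_end: 399 };

var bmwText = sliceInstance(firstSentence, offsetInSentence(bmw), bmw);
var teslaText = sliceInstance(thirdSentence, offsetInSentence(tesla), tesla);
```

Both slices reproduce the text property of the corresponding instance.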
With the text and rule above, the following code:
function onFinalize(result) {
    var extractionsCount = result.match_info.rules.extraction.length;
    var extraction;
    var fieldsCount;
    var i, j; // declared locally so the loops do not create implicit globals

    // Visit every extraction record
    for (i = 0; i < extractionsCount; i++)
    {
        extraction = result.match_info.rules.extraction[i];
        fieldsCount = extraction.fields.length;
        // Visit every field of the record
        for (j = 0; j < fieldsCount; j++)
        {
            // Rename the value only for the BRAND field that matches exactly
            if (extraction.fields[j].field === "BRAND" && extraction.fields[j].value === "Tesla (Veicoli)")
            {
                extraction.fields[j].value = "Tesla (Vehicles)";
            }
        }
    }
    return result;
}
changes the value of the BRAND field from Tesla (Veicoli) to Tesla (Vehicles). The two pairs of tables below show the extraction results as they appear in Studio, first without and then with the manipulation.
Template: BRANDS
Field | Value |
---|---|
@BRAND | BMW |
Template: BRANDS (without manipulation)
Field | Value |
---|---|
@BRAND | Tesla (Veicoli) |
Template: BRANDS
Field | Value |
---|---|
@BRAND | BMW |
Template: BRANDS (with manipulation)
Field | Value |
---|---|
@BRAND | Tesla (Vehicles) |
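The same manipulation can be generalized when several values need renaming. The sketch below is one possible approach, not a documented one: renameFieldValues and the translations map are hypothetical, and only the Tesla entry comes from the example above.

```javascript
// Hypothetical helper: replaces the values of a given field according to a lookup map.
function renameFieldValues(result, fieldName, translations) {
    var records = result.match_info.rules.extraction;
    for (var i = 0; i < records.length; i++) {
        var fields = records[i].fields;
        for (var j = 0; j < fields.length; j++) {
            var current = fields[j].value;
            // Only touch the requested field, and only values present in the map.
            if (fields[j].field === fieldName &&
                Object.prototype.hasOwnProperty.call(translations, current)) {
                fields[j].value = translations[current];
            }
        }
    }
    return result;
}

// Lookup map; only this entry is taken from the example.
var translations = { "Tesla (Veicoli)": "Tesla (Vehicles)" };

function onFinalize(result) {
    return renameFieldValues(result, "BRAND", translations);
}
```

Keeping the replacements in a map means that new values can be added without touching the loops.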