linkPost
Overview
The linkPost module allows you to modify extraction results by:
- Copying or moving a field from one record to another.
- Removing records when some fields are missing.
- Eliminating redundant fields from records.
The module has these methods:
- LINK_FIELD
- VALIDATE_FIELD
- REMOVE_WEAK_FIELD
- load
- apply
- getLastError
- close
When you install the linkPost module in your project, Studio modifies the main.jr file to insert this statement at the beginning of the file:
var linkPost = require("modules/linkPost");
The statement above sets the linkPost variable to an instance of the module so that you can use its methods inside event handling functions.
LINK_FIELD, VALIDATE_FIELD, REMOVE_WEAK_FIELD and apply must be used in the onFinalize function, because they act on the analysis results available when this function is run.
The load method must be used in the initialize function, because it is the right place for the initialization of objects needed in other event handling functions.
The getLastError method must be used together with the load method.
The close method must be used in the shutdown function, because it's the right place to free up the resources allocated by the module.
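The placement rules above can be sketched as a skeleton of main.jr. This is only an illustration, not code generated by Studio: a stand-in object replaces the real module so that the flow can be followed outside Studio, the `calls` array and the driver lines at the end are hypothetical, and `console.error` is used in place of the `CONSOLE` object available in Studio.

```javascript
// Stand-in for the real module, used only so this sketch runs outside Studio.
// In a real project the variable comes from: var linkPost = require("modules/linkPost");
var calls = [];
var linkPost = {
    load: function (configPath) { calls.push("load"); return true; },
    getLastError: function () { calls.push("getLastError"); return ""; },
    REMOVE_WEAK_FIELD: function (result, parameters) { calls.push("REMOVE_WEAK_FIELD"); },
    close: function () { calls.push("close"); }
};

// initialize: the place for load, with getLastError used only if load fails.
function initialize(cmdline) {
    if (!linkPost.load("Config.xml")) {
        console.error(linkPost.getLastError());
        return false;
    }
    return true;
}

// onFinalize: the place for LINK_FIELD, VALIDATE_FIELD, REMOVE_WEAK_FIELD and apply.
function onFinalize(result) {
    linkPost.REMOVE_WEAK_FIELD(result, {
        templateName: "GRAMMAR_CLASSES",
        strongField: "ADJECTIVE",
        weakField: "NOUN"
    });
    return result;
}

// shutdown: the place for close.
function shutdown() {
    linkPost.close();
}

// Hypothetical driver that simulates the order in which the handlers run.
initialize([]);
onFinalize({});
shutdown();
console.log(calls.join(" -> ")); // load -> REMOVE_WEAK_FIELD -> close
```

Because the stand-in `load` succeeds, `getLastError` is never called in this run.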
LINK_FIELD
The LINK_FIELD method copies a field from one record to another record of a different template that has a specific "attractor" field, optionally deleting the field from the source record.
The template of the destination record must have a field with the same name as the source field.
Use this method in the onFinalize event handling function, where extraction results are available.
For example, if you have these templates:
TEMPLATE(SUPERHEROES)
{
    @HUMAN_FULL_NAME,
    @JOB,
    @SUPERHERO_NAME,
    @SUPER_POWER,
    @PUBLISHER
}
TEMPLATE(COMIC_PUBLISHER_AND_WRITERS)
{
    @PUBLISHER,
    @WRITER
}
and these rules:
SCOPE PARAGRAPH
{
    IDENTIFY(SUPERHEROES)
    {
        @HUMAN_FULL_NAME[TYPE(NPH)]
        <>
        @JOB[LEMMA("lawyer")]
        <>
        @SUPERHERO_NAME[KEYWORD("daredevil")]
        <>
        @SUPER_POWER[KEYWORD("enhanced sense", "enhanced senses")]
    }
    IDENTIFY(COMIC_PUBLISHER_AND_WRITERS)
    {
        @PUBLISHER[SYNCON(100173496)] //@SYN: #100173496# [Marvel Comics Group]
        <>
        @WRITER[KEYWORD("brian michael bendis")]
    }
}
applied to this text:
Matt Murdock is a lawyer by day and Daredevil by night, a blind superhero but with other extremely enhanced senses. Daredevil is published by Marvel Comics and his best writer is Brian Michael Bendis.
you will get these records:
Template: COMIC_PUBLISHER_AND_WRITERS
| Field | Value |
|---|---|
| @PUBLISHER | Marvel Comics |
| @WRITER | Brian Michael Bendis |
Template: SUPERHEROES
| Field | Value |
|---|---|
| @HUMAN_FULL_NAME | Matt Murdock |
| @JOB | lawyer |
| @SUPERHERO_NAME | Daredevil |
| @SUPER_POWER | enhanced senses |
With this code:
function onFinalize(result) {
    linkPost.LINK_FIELD(result, {
        sourceFieldName: "PUBLISHER",
        sourceRecordTemplate: "COMIC_PUBLISHER_AND_WRITERS",
        destinationRecordTemplate: "SUPERHEROES",
        attractorName: "SUPERHERO_NAME",
        sourceFieldValue: "*",
        attractorValue: "*",
        scope: "paragraph",
        deleteFlag: true
    });
    return result;
}
or with this other one:
function onFinalize(result) {
    linkPost.LINK_FIELD(result, "COMIC_PUBLISHER_AND_WRITERS", "PUBLISHER", "*", "SUPERHEROES", "SUPERHERO_NAME", "*", "paragraph", true);
    return result;
}
you will get these records:
Template: COMIC_PUBLISHER_AND_WRITERS
| Field | Value |
|---|---|
| @WRITER | Brian Michael Bendis |
Template: SUPERHEROES
| Field | Value |
|---|---|
| @HUMAN_FULL_NAME | Matt Murdock |
| @JOB | lawyer |
| @SUPERHERO_NAME | Daredevil |
| @SUPER_POWER | enhanced senses |
| @PUBLISHER | Marvel Comics |
As you can see, the PUBLISHER field has been deleted from the COMIC_PUBLISHER_AND_WRITERS record and added to the SUPERHEROES record.
The syntax is:
moduleVariable.LINK_FIELD(result, sourceRecordTemplate, sourceFieldName, sourceFieldValue, destinationRecordTemplate, attractorName, attractorValue, scope, deleteFlag[, segmentName|sectionName])
or:
moduleVariable.LINK_FIELD(result, parameters)
where:
- moduleVariable is the variable corresponding to the module and set with require().
- result is the object containing the analysis results.
- sourceFieldName is the name of the source field.
- sourceRecordTemplate is the template name of the source record.
- sourceFieldValue is the value of the source field: the field is used only if it matches this value. The asterisk (*) means any value and is the default if this parameter is omitted.
- destinationRecordTemplate is the template name of the destination record.
- attractorName is the name of the attractor field. The attractor is a field that must exist in the destination record in order to "attract" the source field.
- attractorValue is the value of the attractor field: the source field is copied or moved only if the attractor field has this value. The asterisk (*) means any value and is the default if this parameter is omitted.
- scope (case insensitive) is the scope from which both the source and the attractor fields must have been extracted. It can be:
  - document
  - section
  - paragraph
  - sentence
  - clause
  - phrase
  - token
  - segment
  - segment interval

  Use segment if you require the fields to be part of the same segment, no matter which portion of the segment. Use segment interval if you want the fields to also come from the same sentence within the segment.
- deleteFlag is a Boolean, false by default. Set it to true to remove the field from its source record, thus moving the field instead of copying it.
- segmentName is the name of the segments if scope is segment or segment interval. It can be a string for one segment or an array of strings for multiple segments.
- sectionName is the name of the sections if scope is section. It can be a string for one section or an array of strings for multiple sections.
- parameters is an object whose properties are named parameters. The names of the properties are sourceFieldName, sourceRecordTemplate, destinationRecordTemplate, attractorName, scope, deleteFlag and, optionally, sourceFieldValue, attractorValue, segmentName and sectionName. The type and meaning of each property are the same as those of the positional parameter with the same name described above for the alternative syntax. For optional parameters:
  - sourceFieldValue: if missing, it's as if it were specified with value *.
  - attractorValue: if missing, it's as if it were specified with value *.
  - segmentName: required only if scope is segment or segment interval.
  - sectionName: required only if scope is section.
segmentName, sectionName and the corresponding named parameters in the alternative syntax support the overlap syntax, which follows the same format used in the rule scope options. For example, SEGMENT1:SEGMENT2 represents the intersection of SEGMENT1 and SEGMENT2. This syntax can also be used if the argument or parameter is an array, for example:
["SEGMENT1:SEGMENT2", "SEGMENT3"]
However, if an intersection of segments is used, the only valid scope is segment interval. Declaring a different scope triggers an exception.
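The constraint on intersections can be checked before calling LINK_FIELD. The function below is a hypothetical helper, not part of the linkPost module: it assumes that an intersection is recognizable by the colon in the segment name and it only mirrors the rule stated above, so that a wrong combination is caught early with a readable message.

```javascript
// Hypothetical pre-check, not part of the module: segment names using the
// overlap syntax ("SEGMENT1:SEGMENT2") are accepted only with "segment interval".
function checkSegmentScope(scope, segmentName) {
    // segmentName can be a single string or an array of strings.
    var names = Array.isArray(segmentName) ? segmentName : [segmentName];
    var hasOverlap = names.some(function (name) {
        return name.indexOf(":") !== -1;
    });
    // scope is documented as case insensitive.
    if (hasOverlap && scope.toLowerCase() !== "segment interval") {
        throw new Error('Overlap syntax requires scope "segment interval", got "' + scope + '"');
    }
    return true;
}

checkSegmentScope("segment interval", ["SEGMENT1:SEGMENT2", "SEGMENT3"]); // accepted

try {
    checkSegmentScope("segment", "SEGMENT1:SEGMENT2");
} catch (e) {
    console.log(e.message);
}
```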
In the example above, the field was moved from one record to another because:
- The template of the destination record has a field with the same name as the source field.
- The destination record contains the attractor field SUPERHERO_NAME, which was extracted from the same scope (paragraph) as the source field.
- The last parameter of the method invocation was set to true to delete the field from the source record after the copy.
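The conditions listed above can be illustrated with a simplified model of the copy or move logic. This is a sketch under stated assumptions, not the module's implementation: the record shape (a `template` name plus `fields` holding a `value` and a `paragraph` index) is hypothetical, only paragraph-level scope is modeled, the check that the destination template declares the field is omitted, and deleting the source field only after at least one copy is an assumption.

```javascript
// Simplified model of LINK_FIELD; NOT the module's implementation.
// Assumed record shape: { template: string, fields: { NAME: { value, paragraph } } }.
function linkFieldSketch(records, p) {
    // "*" (or a missing value) matches anything, as documented for the two value parameters.
    var matches = function (value, pattern) {
        return pattern === undefined || pattern === "*" || value === pattern;
    };
    records.forEach(function (source) {
        if (source.template !== p.sourceRecordTemplate) { return; }
        var field = source.fields[p.sourceFieldName];
        if (!field || !matches(field.value, p.sourceFieldValue)) { return; }
        var copied = false;
        records.forEach(function (dest) {
            if (dest.template !== p.destinationRecordTemplate) { return; }
            var attractor = dest.fields[p.attractorName];
            // The attractor must exist, match the requested value and come
            // from the same paragraph as the source field.
            if (!attractor || !matches(attractor.value, p.attractorValue)) { return; }
            if (attractor.paragraph !== field.paragraph) { return; }
            dest.fields[p.sourceFieldName] = { value: field.value, paragraph: field.paragraph };
            copied = true;
        });
        // deleteFlag turns the copy into a move.
        if (copied && p.deleteFlag) { delete source.fields[p.sourceFieldName]; }
    });
    return records;
}

var records = [
    { template: "COMIC_PUBLISHER_AND_WRITERS", fields: {
        PUBLISHER: { value: "Marvel Comics", paragraph: 0 },
        WRITER: { value: "Brian Michael Bendis", paragraph: 0 } } },
    { template: "SUPERHEROES", fields: {
        SUPERHERO_NAME: { value: "Daredevil", paragraph: 0 },
        JOB: { value: "lawyer", paragraph: 0 } } }
];

linkFieldSketch(records, {
    sourceRecordTemplate: "COMIC_PUBLISHER_AND_WRITERS",
    sourceFieldName: "PUBLISHER",
    destinationRecordTemplate: "SUPERHEROES",
    attractorName: "SUPERHERO_NAME",
    deleteFlag: true
});

console.log(Object.keys(records[0].fields).join(", ")); // WRITER
console.log(records[1].fields.PUBLISHER.value);         // Marvel Comics
```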
VALIDATE_FIELD
The VALIDATE_FIELD method deletes records from the extraction results when they don't have validation fields. Removal can be inhibited by specifying supplemental fields whose presence counterbalances the absence of the validation fields.
Use this method in the onFinalize event handling function, where extraction results are available.
For example, with this template:
TEMPLATE(PUZZLE)
{
    @NAME,
    @INVENTOR,
    @INVENTION_YEAR
}
and this rule:
SCOPE SENTENCE
{
    IDENTIFY(PUZZLE)
    {
        @NAME[LEMMA("Rubik's cube")]
        <>
        @INVENTION_YEAR[TYPE(DAT)]
    }
}
applied to this input text:
The Rubik's cube was invented in 1975.
you will get this PUZZLE record:
| Field | Value |
|---|---|
| @NAME | Rubik's cube |
| @INVENTION_YEAR | 1975 |
With this code:
function onFinalize(result) {
    linkPost.VALIDATE_FIELD(result, {
        templateName: "PUZZLE",
        validatorFields: "INVENTOR"
    });
    return result;
}
or with this other one:
function onFinalize(result) {
    linkPost.VALIDATE_FIELD(result, "PUZZLE", "INVENTOR");
    return result;
}
the PUZZLE record is removed from the extraction output because it doesn't contain the INVENTOR field, which acts as the validation field.
The syntax is:
moduleVariable.VALIDATE_FIELD(result, templateName, validationFields[, inhibitionFields])
or:
moduleVariable.VALIDATE_FIELD(result, parameters)
where:
- moduleVariable is the variable corresponding to the module and set with require().
- result is the object containing the analysis results.
- templateName is the template name of the record to filter.
- validationFields is the name of the validation field or an array of names of validation fields. If one or more of those fields is missing, the entire record is removed.
- inhibitionFields (optional) is the name of the inhibition field or an array of names of inhibition fields. If the record is missing validation fields but has one or more inhibition fields, it is not removed.
- parameters is an object whose properties are named parameters. The names of the properties are templateName, validationFields and, optionally, inhibitionFields. The type and meaning of each property are the same as those of the positional parameter with the same name described above for the alternative syntax.
As an example of inhibition fields, consider the same template, rule and text used above.
This code:
function onFinalize(result) {
    linkPost.VALIDATE_FIELD(result, {
        templateName: "PUZZLE",
        validatorFields: "INVENTOR",
        inhibitorFields: "NAME"
    });
    return result;
}
produces this PUZZLE record:
| Field | Value |
|---|---|
| @NAME | Rubik's cube |
| @INVENTION_YEAR | 1975 |
because even though validation field INVENTOR is missing, inhibition field NAME is present, so the record cannot be removed.
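The interplay between validation and inhibition fields can be illustrated with a simplified model of the removal rule. This is a sketch, not the module's implementation: the record shape (a `template` name plus a `fields` object mapping field names to values) and the helper names are hypothetical.

```javascript
// Simplified model of VALIDATE_FIELD; NOT the module's implementation.
// Assumed record shape: { template: string, fields: { NAME: value } }.
function toArray(x) {
    if (x === undefined || x === null) { return []; }
    return Array.isArray(x) ? x : [x];
}

function validateFieldSketch(records, templateName, validationFields, inhibitionFields) {
    var required = toArray(validationFields);
    var inhibitors = toArray(inhibitionFields);
    return records.filter(function (record) {
        // Records of other templates are left untouched.
        if (record.template !== templateName) { return true; }
        var has = function (name) {
            return Object.prototype.hasOwnProperty.call(record.fields, name);
        };
        var allValidationFieldsPresent = required.every(has);
        var inhibited = inhibitors.some(has);
        // Keep the record if nothing is missing, or if an inhibition field is present.
        return allValidationFieldsPresent || inhibited;
    });
}

var puzzle = { template: "PUZZLE", fields: { NAME: "Rubik's cube", INVENTION_YEAR: "1975" } };

console.log(validateFieldSketch([puzzle], "PUZZLE", "INVENTOR").length);         // 0
console.log(validateFieldSketch([puzzle], "PUZZLE", "INVENTOR", "NAME").length); // 1
```

The first call removes the record because INVENTOR is missing; the second keeps it because NAME is present and inhibits the removal.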
REMOVE_WEAK_FIELD
The REMOVE_WEAK_FIELD method removes one or more redundant fields (the weak fields) if the same record contains another field (the strong field) with the same value.
Use this method in the onFinalize event handling function, where extraction results are available.
For example, with this template:
TEMPLATE(GRAMMAR_CLASSES)
{
    @ADJECTIVE,
    @NOUN
}
and this extraction rule:
SCOPE SENTENCE
{
    IDENTIFY(GRAMMAR_CLASSES)
    {
        @ADJECTIVE[LEMMA("blue")]
        OR
        @NOUN[LEMMA("blue")]
    }
}
applied to this input text:
The sky is blue.
you will get this GRAMMAR_CLASSES record:
| Field | Value |
|---|---|
| @NOUN | blue |
| @ADJECTIVE | blue |
With this code:
function onFinalize(result) {
    linkPost.REMOVE_WEAK_FIELD(result, {
        templateName: "GRAMMAR_CLASSES",
        strongField: "ADJECTIVE",
        weakField: "NOUN"
    });
    return result;
}
or with this other one:
function onFinalize(result) {
    linkPost.REMOVE_WEAK_FIELD(result, "GRAMMAR_CLASSES", "ADJECTIVE", "NOUN");
    return result;
}
the record will become:
| Field | Value |
|---|---|
| @ADJECTIVE | blue |
As you can see, the NOUN field has been removed.
The syntax is:
moduleVariable.REMOVE_WEAK_FIELD(result, templateName, strongField, weakField [, caseInsensitive])
or:
moduleVariable.REMOVE_WEAK_FIELD(result, parameters)
where:
- moduleVariable is the variable corresponding to the module and set with require().
- result is the object containing the analysis results.
- templateName is the template name of the record.
- strongField is the name of the strong field to keep.
- weakField is the weak field or fields to remove. It can be:
  - If there's only one weak field, the name of the weak field.
  - If there are multiple weak fields, an array containing the names of the weak fields.
  - null if all the fields but the strong field must be considered weak.
- caseInsensitive is an optional Boolean, false by default. If set to true, the match between the value of the strong field and the values of the weak fields is case insensitive.
- parameters is an object whose properties are named parameters. The names of the properties are templateName, strongField, weakField and, optionally, caseInsensitive. The type and meaning of each property are the same as those of the positional parameter with the same name described above for the alternative syntax.
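The three forms of weakField and the effect of caseInsensitive can be illustrated with a simplified model. This is a sketch, not the module's implementation: it assumes a hypothetical record shape (a plain object mapping field names to string values) and works on a single record rather than on the whole result object.

```javascript
// Simplified model of REMOVE_WEAK_FIELD; NOT the module's implementation.
// Assumed record shape: a plain object mapping field names to string values.
function removeWeakFieldSketch(record, strongField, weakField, caseInsensitive) {
    var has = function (name) {
        return Object.prototype.hasOwnProperty.call(record, name);
    };
    if (!has(strongField)) {
        return record; // no strong field, nothing to compare against
    }
    var normalize = function (value) {
        return caseInsensitive ? String(value).toLowerCase() : String(value);
    };
    var strongValue = normalize(record[strongField]);

    // weakField can be a name, an array of names, or null (all other fields).
    var weakNames;
    if (weakField === null) {
        weakNames = Object.keys(record);
    } else if (Array.isArray(weakField)) {
        weakNames = weakField;
    } else {
        weakNames = [weakField];
    }

    // Remove each weak field whose value equals the value of the strong field.
    weakNames.forEach(function (name) {
        if (name !== strongField && has(name) && normalize(record[name]) === strongValue) {
            delete record[name];
        }
    });
    return record;
}

var a = removeWeakFieldSketch({ NOUN: "blue", ADJECTIVE: "blue" }, "ADJECTIVE", "NOUN");
console.log(Object.keys(a).join(", ")); // ADJECTIVE

var b = removeWeakFieldSketch({ NOUN: "Blue", ADJECTIVE: "blue" }, "ADJECTIVE", null, true);
console.log(Object.keys(b).join(", ")); // ADJECTIVE

var c = removeWeakFieldSketch({ NOUN: "Blue", ADJECTIVE: "blue" }, "ADJECTIVE", null);
console.log(Object.keys(c).join(", ")); // NOUN, ADJECTIVE
```

In the last call the values differ only by case and caseInsensitive is not set, so the NOUN field is kept.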
load
The load method prepares one or more of the operations that can be attained with the methods above, but using as its source a configuration file generated when importing a project created with a legacy edition of Studio. Prepared operations are then applied using the apply method.
Warning
You rarely need to write code that uses the load method: it is required only for projects imported from a legacy edition of Studio, and the import procedure already generates the appropriate statements inside the main.jr file.
Use this method in the initialize event handling function.
For example, when importing an old project, Studio may generate this code:
var linkPost = require("modules/linkPost");
function initialize(cmdline) {
    if (!linkPost.load('Config.xml')) {
        CONSOLE.error(linkPost.getLastError());
        return false;
    }
    return true;
}
function onFinalize(result) {
    result = linkPost.apply(result);
    return result;
}
The syntax is:
moduleVariable.load(configPath)
where:
- moduleVariable is the variable corresponding to the module and set with require().
- configPath is the path of the configuration file generated by the import procedure.
The method returns true in case of success, false otherwise. In case of failure it sets an error message you can retrieve with the getLastError method.
apply
The apply method performs all the operations prepared with the invocation of the load method.
Use this method in the onFinalize event handling function, where extraction results are available.
For example:
function onFinalize(result) {
    result = linkPost.apply(result);
    return result;
}
The syntax is:
moduleVariable.apply(result)
where:
- moduleVariable is the variable corresponding to the module and set with require().
- result is the object containing the analysis results.
getLastError
The getLastError method retrieves the message for the last error that occurred when the load method failed. Use it to display the error message.
For example:
function initialize(cmdline) {
    if (!linkPost.load('Config.xml')) {
        CONSOLE.error(linkPost.getLastError());
        return false;
    }
    return true;
}
The syntax is:
moduleVariable.getLastError()
where moduleVariable is the variable corresponding to the module and set with require().
close
The close method is used to free up the resources allocated by the linkPost module object.
It's not mandatory to invoke this method, but if you decide to do it, invoke it inside the shutdown function.
For example:
function shutdown() {
    linkPost.close();
}
The syntax is:
moduleVariable.close()
where moduleVariable is the variable corresponding to the module and set with require().