EXTENSION

EXTENSION is an extraction transformation option that completes the matched value rather than normalizing it: it adds the text surrounding the original matched data to the final extracted value.

Its action is based on the concept of "extension", which in turn is linked to the SCOPE statement (the part of the rule's syntax in which the portion of text to which a rule applies is specified).

The EXTENSION option reads the SCOPE statement found at the beginning of the rule and returns the portion of text, as defined by that scope, that contains the value matched by the attribute. In this way, EXTENSION dynamically reproduces what the options PHRASE, CLAUSE, SENTENCE, PARAGRAPH, SEGMENT and SECTION do, without having to choose the extension beforehand.

The syntax of the EXTENSION option is the following:

SCOPE scopeOption
{
    IDENTIFY(templateName)
    {
        @field[attribute]|[EXTENSION]
    }
}

This option is useful when the extraction output must be expanded around a matched element to cover exactly the extension of the rule. By contrast, the options PHRASE, CLAUSE, SENTENCE, PARAGRAPH, SEGMENT and SECTION allow the transformation to use an extension different from the one set in the rule's SCOPE statement.
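
As an illustrative sketch of this contrast, the two rules below share the same scope but differ in the transformation option. It assumes that the fixed options are written with the same |[OPTION] syntax as EXTENSION; templateName, field and attribute are placeholders, as in the syntax shown above. In the first rule the attribute is matched within a clause, but the extracted value would be the whole sentence:

SCOPE CLAUSE
{
    IDENTIFY(templateName)
    {
        @field[attribute]|[SENTENCE]
    }
}

In the second rule the transformation follows the scope, so the extracted value would be the clause itself, and it would change automatically if the SCOPE statement were edited:

SCOPE CLAUSE
{
    IDENTIFY(templateName)
    {
        @field[attribute]|[EXTENSION]
    }
}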

Consider the following example:

SCOPE CLAUSE
{
    IDENTIFY(SUBJECT)
    {
        @Subject[TYPE(NPH) + ROLE(SUBJECT)]|[EXTENSION]
    }
}

The purpose of this rule is to extract people's names (TYPE(NPH)), but only if the names identified are the subjects of a clause (+ ROLE(SUBJECT)). If this condition is met, the EXTENSION transformation option ensures that every extracted value is expanded to the extension defined in the SCOPE statement of the rule, in this case a clause.

Consider the extraction output if the rule above is run against the following sample sentence:

Assistant Commissioner Simon Byrne described detectives as "constables in T-shirts and jeans" and said he wanted to end the division between uniformed officers and detectives.

The text contains one value matching the sample rule: Simon Byrne. Simon Byrne is recognized both as a person's name and as the subject of the first clause of the sentence, so the rule is triggered. The EXTENSION transformation then extracts the extension defined in the SCOPE statement of the rule, that is, the clause to which the person's name belongs: Assistant Commissioner Simon Byrne described detectives as "constables in T-shirts and jeans".

If the rule's scope is changed to PHRASE and the transformation remains unchanged, the extraction will become the phrase that contains the person's name: Assistant Commissioner Simon Byrne.
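
For reference, the PHRASE-scoped variant of the sample rule would look like the following. Only the SCOPE statement changes; the template, field, attribute and transformation option are the same as in the rule above.

SCOPE PHRASE
{
    IDENTIFY(SUBJECT)
    {
        @Subject[TYPE(NPH) + ROLE(SUBJECT)]|[EXTENSION]
    }
}

Because EXTENSION reads the scope at run time, no change to the transformation option is needed to obtain the phrase instead of the clause.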