SECTOR

SECTOR can be described as a "completion feature" applied to the matched value rather than a normalization feature. It adds to the final output value the elements that surround the matched data in the input sentence.

Its action is similar to that of the SEQUENCE option, and it is based on the concepts of "phrase" and "sequence" (sequence operators being one of the classes of operators available in the Rules language).

The SEQUENCE option returns all the elements included in a rule's sequence along with the value matched by the attribute enclosed in the rule itself. The SECTOR option not only returns the sequence, but it also expands the output to the phrase containing the attributes that are part of the sequence.

The syntax for extraction rules is:

SCOPE scopeOption
{
    IDENTIFY(templateName)
    {
        attribute1
        sequenceOperator
        @field[attribute2]|[SECTOR]
    }
}

The syntax for tagging rules is:

SCOPE scopeOption
{
    TAGGER(tagLevel)
    {
        attribute1
        sequenceOperator
        @tag[attribute2]|[SECTOR]
    }
}

sequenceOperator refers to one of the positional or logical sequence operators available in the Rules language. Operators and attributes other than the one carrying the option can be placed before or after the field- or tag-prefixed operand, and as many operators and attributes as needed may be used. The SECTOR option must be used in a rule containing at least one sequence.

Consider the following example:

SCOPE SENTENCE
{
    IDENTIFY(TEST)
    {
        TYPE(ADJ)
        >>
        @field_1[LEMMA("virus")]|[SECTOR]
    }
}

The purpose of this rule is to extract the lemma virus, in singular or plural form, only if it is strictly preceded (double greater-than sign, >>) by an adjective (TYPE(ADJ)). If this condition is met, the SECTOR transformation option expands the extracted value to the phrase that contains the sequence specified in the rule.
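The matching condition can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not how the engine works internally: the Token type, the adj_before_virus helper, the sample fragment and its part-of-speech tags are all hypothetical and hand-assigned. It checks that an adjective sits immediately before a token whose lemma is virus, which is the constraint expressed by TYPE(ADJ) >> LEMMA("virus").

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    text: str
    pos: str    # part-of-speech tag, hand-assigned for this illustration
    lemma: str  # base form, hand-assigned for this illustration


def adj_before_virus(tokens: List[Token]) -> List[Tuple[int, int]]:
    """Return (adjective_index, virus_index) pairs where an adjective is
    immediately followed by a token whose lemma is 'virus'."""
    hits = []
    for i in range(len(tokens) - 1):
        if tokens[i].pos == "ADJ" and tokens[i + 1].lemma == "virus":
            hits.append((i, i + 1))
    return hits


# Hypothetical fragment: "a deadly virus and new types of viruses"
tokens = [
    Token("a", "ART", "a"),
    Token("deadly", "ADJ", "deadly"),
    Token("virus", "NOU", "virus"),
    Token("and", "CON", "and"),
    Token("new", "ADJ", "new"),
    Token("types", "NOU", "type"),
    Token("of", "PRE", "of"),
    Token("viruses", "NOU", "virus"),
]

hits = adj_before_virus(tokens)
print(hits)
```

In this fragment only "deadly virus" satisfies the condition: "new" is an adjective, but it is followed by "types", so "viruses" is not strictly preceded by an adjective.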

Consider the extraction output if the rule above is run against the following sample text:

Flu Widespread, Leading a Range of Winter's Ills
By DONALD G. McNEIL Jr. and KATHARINE Q. SEELYE
Published: January 9, 2013
It is not your imagination - more people you know are sick this winter, even people who have had flu shots.
The country is in the grip of three emerging flu or flulike epidemics: an early start to the annual flu season with an unusually aggressive virus, a surge in a new type of norovirus, and the worst whooping cough outbreak in 60 years. And these are all developing amid the normal winter highs for the many viruses that cause symptoms on the "colds and flu" spectrum.
Influenza is widespread, and causing local crises. On Wednesday, Boston's mayor declared a public health emergency as cases flooded hospital emergency rooms. [...]

The text contains two combinations of values that the rule condition would match: many viruses and aggressive virus. Both strings are composed of an adjective preceding the lemma virus.
These sequences are contained in the following prepositional phrases:

with an unusually aggressive virus

for the many viruses

and the SECTOR transformation causes these whole phrases to be extracted.
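The difference between the matched value, the sequence and the phrase can be summarized with a short Python sketch. It is a hedged illustration only: the output_value function and the index positions are hypothetical, the phrase boundaries are copied by hand from the example above, and the sketch returns surface text without applying any normalization that the engine might perform on the plain matched value. The SEQUENCE result follows the description of that option given earlier on this page.

```python
def output_value(phrase, seq_start, seq_end, match_index, option=None):
    """Return the surface text that each option would cover.

    phrase      -- list of words forming the phrase that holds the match
    seq_start   -- index of the first word of the rule's sequence
    seq_end     -- index of the last word of the rule's sequence
    match_index -- index of the word matched by the field's attribute
    option      -- None, "SEQUENCE" or "SECTOR"
    """
    if option == "SECTOR":
        # Expand to the whole phrase containing the sequence.
        return " ".join(phrase)
    if option == "SEQUENCE":
        # Return only the elements that belong to the sequence.
        return " ".join(phrase[seq_start:seq_end + 1])
    # No option: only the word matched by the attribute.
    return phrase[match_index]


phrase_1 = ["with", "an", "unusually", "aggressive", "virus"]
phrase_2 = ["for", "the", "many", "viruses"]

results = {
    option: [
        output_value(phrase_1, 3, 4, 4, option),
        output_value(phrase_2, 2, 3, 3, option),
    ]
    for option in (None, "SEQUENCE", "SECTOR")
}

for option, values in results.items():
    print(option, values)
```

Under these assumptions, the SEQUENCE entries are "aggressive virus" and "many viruses", while the SECTOR entries are the two full prepositional phrases listed above.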