
Intersection of scope options

SECTION and SEGMENT scope options can be used in advanced combinations that imply the intersection of a section and one or more segments, or the intersection of two or more segments.

The syntax for enabling the first kind of combination is the following:

SCOPE SECTION(sectionName:segmentName) [ON ATOM]
{
    rule(s)
}

Note

Parts between square brackets ([]) are optional.

ON ATOM is optional. When present, the rules in the scope are triggered based on an atom-based count of the textual elements of the sentence. You can find a practical example in the positional sequences section of this manual.

Just as when the SECTION scope option is used by itself, the input documents must contain SECTION annotations. For example, consider a newspaper article in which the TITLE, LEAD and BODY of the text are annotated as sections, and a segment named BYLINE is created to capture information about the article's author. In that case, the following scope:

SCOPE SECTION(BODY:BYLINE)
{
    //rule(s)//
}

will act upon a text block contained in a particular "region" of the document resulting from the intersection of the section BODY and the segment BYLINE, thus allowing very specific matches on selected portions of a text.
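
As an illustration, the scope above could be filled in with an extraction rule like the one sketched below. This is only a sketch under stated assumptions: the template ARTICLE and the field AUTHOR are hypothetical names that would have to be declared in your project, and the TYPE(NPH) attribute (proper names of people) is just one plausible way to match an author's name.

```
SCOPE SECTION(BODY:BYLINE)
{
    // Hypothetical template ARTICLE with a hypothetical field AUTHOR:
    // both must be declared in the project for this rule to compile.
    IDENTIFY(ARTICLE)
    {
        // Assumed condition: match proper names of people found
        // where the BODY section and the BYLINE segment overlap.
        @AUTHOR[TYPE(NPH)]
    }
}
```

With a rule like this, a person's name appearing in the BODY section but outside the BYLINE segment would not be extracted, because it falls outside the intersection.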

The syntax for enabling the second kind of combination is the following:

SCOPE SEGMENT(segmentName:segmentName) [ON ATOM]
{
    rule(s)
}

Scope intersection is also possible between two or more segments. For example, consider a letter or e-mail in which it is possible to recognize the parts of text containing the sender's name and the receiver's name, and other parts containing addresses. If these parts are captured by the segments SENDER, RECEIVER and ADDRESSES, the following code:

SCOPE SEGMENT(SENDER:ADDRESSES, RECEIVER:ADDRESSES)
{
    //rule(s)//
}

will apply the rules to the text blocks contained in the "regions" of the document resulting from the intersection of the segment SENDER with the segment ADDRESSES and of the segment RECEIVER with the segment ADDRESSES, thus recognizing, by proximity, the addresses belonging to the sender and to the receiver of the letter, respectively.
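
If the two addresses need to end up in different fields, one possible variation is to give each intersection its own scope. The sketch below assumes a hypothetical template LETTER with the hypothetical fields SENDER_ADDRESS and RECEIVER_ADDRESS, all of which would have to be declared in your project. The TYPE(ADR) attribute is likewise an assumption about how addresses are recognized in the text.

```
SCOPE SEGMENT(SENDER:ADDRESSES)
{
    // Hypothetical template and field names.
    IDENTIFY(LETTER)
    {
        // Assumed condition: an address found where SENDER and ADDRESSES overlap.
        @SENDER_ADDRESS[TYPE(ADR)]
    }
}

SCOPE SEGMENT(RECEIVER:ADDRESSES)
{
    IDENTIFY(LETTER)
    {
        // Assumed condition: an address found where RECEIVER and ADDRESSES overlap.
        @RECEIVER_ADDRESS[TYPE(ADR)]
    }
}
```

Keeping the scopes separate makes it explicit which intersection feeds which field, whereas the combined scope is the more compact choice when the same rules apply to both regions.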