Intersection of scope options
The SECTION and SEGMENT scope options can be used in advanced combinations that imply the intersection of a section and one or more segments, or the intersection of two or more segments.
The syntax for enabling the first kind of combination is the following:
SCOPE SECTION(sectionName:segmentName) [ON ATOM]
{
rule(s)
}
Note
Parts between square brackets ([]) are optional.
ON ATOM is optional and makes your rules trigger according to an atom-based count of the textual elements of the sentence. You can find a practical example in the positional sequences section of this manual.
When the SECTION scope option is used by itself, the input documents must contain SECTION annotations. For example, if we consider a newspaper article where the TITLE, LEAD and BODY of the text are annotated as SECTION, and a segment named BYLINE is created to capture information about the article's author, the following rule:
SCOPE SECTION(BODY:BYLINE)
{
//rule(s)//
}
will act upon a text block contained in a particular "region" of the document resulting from the intersection of the section BODY and the segment BYLINE, thus allowing very specific matches on selected portions of a text.
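The idea of a "region" resulting from an intersection can be pictured with a minimal Python sketch. It is only a conceptual model, not the engine's implementation: it assumes that sections and segments can be represented as character-offset ranges, and the `intersect` helper and all offsets are hypothetical.

```python
from typing import List, Tuple

# A span is a (start, end) pair of character offsets, end exclusive.
# Modelling sections and segments as spans is an assumption of this sketch.
Span = Tuple[int, int]


def intersect(a: List[Span], b: List[Span]) -> List[Span]:
    """Return the parts of the text covered by both lists of spans."""
    result = []
    for a_start, a_end in a:
        for b_start, b_end in b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # keep only non-empty overlaps
                result.append((start, end))
    return sorted(result)


# Hypothetical offsets: one BODY section, two BYLINE segment instances.
body_section = [(120, 900)]
byline_segment = [(40, 80), (850, 900)]

# Only the BYLINE instance that falls inside BODY is part of the scope.
scope = intersect(body_section, byline_segment)
print(scope)
```

In this model, the BYLINE instance located before the BODY section is ignored, which mirrors how the combined scope restricts rules to the overlapping portion only.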
The syntax for enabling the second kind of combination is the following:
SCOPE SEGMENT(segmentName:segmentName) [ON ATOM]
{
rule(s)
}
Scope intersection is also possible among two or more segments. For example, if we consider a letter or e-mail where it is possible to recognize text subdivisions containing the sender's name and the receiver's name, and other parts containing addresses; if these are contained in as many segments (SENDER, RECEIVER and ADDRESSES), the following code:
SCOPE SEGMENT(SENDER:ADDRESSES, RECEIVER:ADDRESSES)
{
//rule(s)//
}
will apply the rule(s) to text blocks contained in particular "regions" of the document resulting from the intersection of the segment SENDER with the segment ADDRESSES, and of the segment RECEIVER with the segment ADDRESSES, thus recognizing, by proximity, the addresses belonging respectively to the sender and the receiver of the letter.
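How a comma-separated list of segment pairs yields one region per pair can be sketched in Python as follows. This is an illustrative model under stated assumptions: segments are represented as character-offset ranges, and the `scope_regions` function, the `overlap` helper and every offset are hypothetical, not part of the rules language.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def overlap(a: List[Span], b: List[Span]) -> List[Span]:
    """Return the non-empty overlaps between two lists of spans."""
    found = []
    for a_start, a_end in a:
        for b_start, b_end in b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:
                found.append((start, end))
    return sorted(found)


def scope_regions(
    segments: Dict[str, List[Span]],
    pairs: List[Tuple[str, str]],
) -> Dict[Tuple[str, str], List[Span]]:
    """Compute one intersection per declared pair of segment names."""
    return {
        (first, second): overlap(segments[first], segments[second])
        for first, second in pairs
    }


# Hypothetical layout of a letter: the sender block, then the receiver block,
# each containing one address.
segments = {
    "SENDER": [(0, 200)],
    "RECEIVER": [(200, 400)],
    "ADDRESSES": [(150, 190), (350, 390)],
}
pairs = [("SENDER", "ADDRESSES"), ("RECEIVER", "ADDRESSES")]

regions = scope_regions(segments, pairs)
for pair, spans in regions.items():
    print(pair, spans)
```

Each pair is resolved independently, so the address inside the sender block is kept apart from the address inside the receiver block.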