The POSITION attribute identifies a token by its position in a text. A token is recognized only if it is found in the specified position.
The syntax is:
POSITION(position1[, position2, ...])
POSITION is the attribute name and must be written in uppercase.
position# refers to one of a list of predefined values that identify key positions of textual elements inside the document. These textual elements include any sequence of alphabetical characters, numbers and punctuation marks.
A rule using the POSITION attribute is valid only if the position is specified in a predefined format. All positions are listed in the table below, with a brief description of each.
|Position|Description|
|---|---|
|BEGIN SENTENCE|First token in a sentence|
|END SENTENCE|Last token in a sentence|
|BEGIN PARAGRAPH|First token in a paragraph|
|END PARAGRAPH|Last token in a paragraph|
|BEGIN SECTION|First token in a document|
|END SECTION|Last token in a document|
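The format requirement can be illustrated with a minimal Python sketch that checks a POSITION statement against the positions listed in the table. This is not the engine's own validator: the function name `parse_position`, the tolerance for spaces around commas, and the error messages are all assumptions made for illustration.

```python
import re

# Position names copied from the table above.
VALID_POSITIONS = {
    "BEGIN SENTENCE", "END SENTENCE",
    "BEGIN PARAGRAPH", "END PARAGRAPH",
    "BEGIN SECTION", "END SECTION",
}

def parse_position(statement):
    """Return the positions named in a POSITION(...) statement.

    Hypothetical helper: raises ValueError if the attribute name is not
    uppercase POSITION or if any position is not a predefined value.
    """
    match = re.fullmatch(r"POSITION\((.*)\)", statement.strip())
    if match is None:
        raise ValueError("expected POSITION(...) with an uppercase attribute name")
    # Assumption: positions are comma-separated and surrounding spaces are ignored.
    positions = [p.strip() for p in match.group(1).split(",")]
    unknown = [p for p in positions if p not in VALID_POSITIONS]
    if unknown:
        raise ValueError(f"unknown position(s): {unknown}")
    return positions

print(parse_position("POSITION(BEGIN SENTENCE, END PARAGRAPH)"))
```

Under these assumptions, a lowercase attribute name such as `position(BEGIN SENTENCE)` or an unlisted value such as `POSITION(MIDDLE SENTENCE)` is rejected with a `ValueError`.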
Please note: the POSITION attribute, if used alone, overgenerates, because it matches every token found in the specified position, whatever that token is. It is highly recommended to use the POSITION attribute in conjunction with other attributes.
The POSITION attribute allows the use of one or more positions in a given statement. A token is identified in a text if it is found in the specified position. For example:
POSITION(BEGIN SENTENCE)
This statement would identify any element found at the beginning of a sentence.
For demonstration purposes, let's imagine the statement above is used alone in a rule. In a text such as:
Investigators said it could take months to create a full account of the events preceding and during the killing rampage. The State Police officially confirmed the identity of the killer.
The elements recognized as the beginning of a sentence would be Investigators and The.
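The behaviour described in this example can be approximated with a short Python sketch. It is only an illustration: the sentence splitting and tokenization below are naive regular-expression assumptions, the helper names are hypothetical, and the actual engine's text analysis is certainly more sophisticated.

```python
import re

def sentence_tokens(text):
    """Split text into sentences, then each sentence into tokens.

    Assumptions: a sentence ends at '.', '!' or '?' followed by whitespace,
    and a token is either a run of word characters or one punctuation mark.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [re.findall(r"\w+|[^\w\s]", s) for s in sentences if s]

def begin_sentence_tokens(text):
    """Return the first token of every sentence, mimicking BEGIN SENTENCE used alone."""
    return [tokens[0] for tokens in sentence_tokens(text) if tokens]

text = (
    "Investigators said it could take months to create a full account "
    "of the events preceding and during the killing rampage. "
    "The State Police officially confirmed the identity of the killer."
)

print(begin_sentence_tokens(text))  # ['Investigators', 'The']
```

The sketch selects the first token of each sentence without looking at what the token is, which is why a position used alone matches so broadly.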