POSITION attribute
The POSITION attribute identifies a token by its position in a text. The token is recognized only if it is found in the specified position.
The syntax is:
POSITION(position1[, position2, ...])
where:
POSITION is the attribute name and must be written in uppercase.
position# is one of a list of predefined values that identify key positions of textual elements inside the document itself. These textual elements include any sequence of alphabetical characters, numbers and punctuation marks.
A rule using the POSITION attribute is valid only if each position is written in its predefined format. The table below lists all positions with a brief description of each.
Position | Description |
---|---|
BEGIN SENTENCE | First token in a sentence |
END SENTENCE | Last token in a sentence |
BEGIN PARAGRAPH | First token in a paragraph |
END PARAGRAPH | Last token in a paragraph |
BEGIN SECTION | First token in a document |
END SECTION | Last token in a document |
Warning
Used alone, the POSITION attribute is highly over-generative: it identifies any token found in the specified position. It is strongly recommended to use the POSITION attribute in conjunction with other attributes.
The POSITION attribute accepts one or more positions in a single statement. A token is identified if it is found in the specified position.
For example:
POSITION(BEGIN SENTENCE)
This statement would identify any element found at the beginning of a sentence.
For demonstration purposes, suppose the statement above is used alone in a rule. In a text such as:
Investigators said it could take months to create a full account of the events preceding and during the killing rampage. The State Police officially confirmed the identity of the killer.
The elements recognized as sentence beginnings would be Investigators and The, one for each of the two sentences.
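
The behavior described above can be imitated outside the rule engine with a minimal Python sketch. It is only an illustration under stated assumptions: the sentence splitter, the tokenizer and the `tokens_at` helper are hypothetical stand-ins, not the engine's actual text analysis, and the final filter merely mimics what combining POSITION with another attribute would achieve.

```python
import re

TEXT = (
    "Investigators said it could take months to create a full account of "
    "the events preceding and during the killing rampage. "
    "The State Police officially confirmed the identity of the killer."
)

def split_sentences(text):
    # Naive splitter (assumption): a sentence ends at '.', '!' or '?'
    # followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    # Assumption: words and numbers are tokens, and each punctuation
    # mark is a token of its own.
    return re.findall(r"\w+|[^\w\s]", sentence)

def tokens_at(text, position):
    # Hypothetical helper: map a position name to a token index
    # and pick that token from every sentence.
    index = {"BEGIN SENTENCE": 0, "END SENTENCE": -1}[position]
    return [tokenize(s)[index] for s in split_sentences(text)]

# Position used alone: one match per sentence, whatever the token is.
begin_tokens = tokens_at(TEXT, "BEGIN SENTENCE")
print(begin_tokens)  # ['Investigators', 'The']

# Adding a second condition (here a plain keyword test) narrows the matches.
narrowed = [t for t in begin_tokens if t.lower() == "investigators"]
print(narrowed)  # ['Investigators']
```

The first result shows why the attribute over-generates when used alone: every sentence contributes a match regardless of its content. The second shows how an additional constraint keeps only the relevant token.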