POSITION attribute

The POSITION attribute identifies a token by its position in a text. A token is recognized only if it is found in a specified position.

The syntax is:

POSITION(position1[, position2, ...])

where:

POSITION is the attribute name and must be written in uppercase.
position# is one of the predefined values that identify key positions of textual elements within the document. These textual elements include any sequence of alphabetical characters, numbers and punctuation marks.

A rule using the POSITION attribute is valid only if every position is written as one of the predefined values. The table below lists all positions with a brief description of each.

Position	Description
BEGIN SENTENCE	First token in a sentence
END SENTENCE	Last token in a sentence
BEGIN PARAGRAPH	First token in a paragraph
END PARAGRAPH	Last token in a paragraph
BEGIN SECTION	First token in a document
END SECTION	Last token in a document
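
As an illustration of the validity constraint, the following Python sketch checks a POSITION statement against the values in the table. It is not part of the rules language or of any official tooling: the helper name parse_position is hypothetical, and the strict whitespace handling is an assumption made for this example.

```python
import re

# Position values copied from the table above.
VALID_POSITIONS = {
    "BEGIN SENTENCE", "END SENTENCE",
    "BEGIN PARAGRAPH", "END PARAGRAPH",
    "BEGIN SECTION", "END SECTION",
}


def parse_position(statement):
    """Return the positions listed in a POSITION(...) statement.

    Hypothetical helper for illustration only. Raises ValueError when the
    attribute name is not uppercase POSITION or a value is not predefined.
    """
    match = re.fullmatch(r"POSITION\((.*)\)", statement.strip())
    if match is None:
        raise ValueError("expected POSITION(...) with an uppercase name")
    # Positions are separated by commas; surrounding spaces are ignored.
    positions = [p.strip() for p in match.group(1).split(",")]
    for p in positions:
        if p not in VALID_POSITIONS:
            raise ValueError(f"unknown position: {p!r}")
    return positions
```

For instance, parse_position("POSITION(BEGIN SENTENCE, END PARAGRAPH)") returns both positions, while a lowercase attribute name or a value such as START SENTENCE raises ValueError.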

Warning

Used alone, the POSITION attribute is hyper-generative: it matches every token found in the specified position. It is highly recommended to use the POSITION attribute in conjunction with other attributes.

A single POSITION statement can specify one or more positions.

For example:

POSITION(BEGIN SENTENCE)

This statement identifies any token found at the beginning of a sentence.

For demonstration purposes, suppose the statement above is used alone in a rule. In a text such as:

Investigators said it could take months to create a full account of the events preceding and during the killing rampage. The State Police officially confirmed the identity of the killer.

The tokens recognized at the beginning of a sentence would be Investigators and The.
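
The same result can be reproduced with a rough Python sketch. It only approximates the behavior described above: the sentence splitter and tokenizer are naive regular expressions assumed for this example, not the engine's own text analysis, and the function names are hypothetical.

```python
import re

TEXT = (
    "Investigators said it could take months to create a full account "
    "of the events preceding and during the killing rampage. "
    "The State Police officially confirmed the identity of the killer."
)


def split_sentences(text):
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    # Assumption: a token is a run of word characters or one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", sentence)


def sentence_initial_tokens(text):
    """Return the first token of each sentence, mimicking BEGIN SENTENCE."""
    firsts = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        if tokens:
            firsts.append(tokens[0])
    return firsts


print(sentence_initial_tokens(TEXT))  # ['Investigators', 'The']
```

Every sentence contributes one match regardless of what the token is, which is why the attribute is best combined with other attributes.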