
Transformation overview

By default, when an extraction rule is triggered, each field is set to the text of the token matched by the operand the field is associated with. With an optional transformation, the field is instead set to data that is still related to the matched token but is taken from the disambiguation output or from the Knowledge Graph.

The syntax is:

@fieldName[operand]|[transformation]

Possible transformations are:

Transformation names are language keywords and must be written in uppercase.
The transformation name must be enclosed in square brackets and placed at the end of the field-prefixed operand, preceded by a pipe character (vertical bar, |).
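
As a minimal sketch of the syntax rules above, the following extraction rule attaches the SENTENCE transformation to a field. The template name QUOTES, the field name SPEECH and the LEMMA operand are hypothetical, chosen only for illustration; they are not defined in this topic.

```
SCOPE SENTENCE
{
    IDENTIFY(QUOTES)
    {
        // QUOTES and SPEECH are hypothetical names used only for illustration.
        // The operand matches a token whose lemma is "say"; the transformation
        // name is uppercase, in brackets, and follows the pipe character.
        @SPEECH[LEMMA("say")]|[SENTENCE]
    }
}
```

Without the |[SENTENCE] suffix, the same field would be set to the text of the matched token alone, for example "said".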

Each transformation is described in a dedicated topic.

Note

When using a transformation, the beginning and the end positions in the extraction output are those of the token matched by the operand, not those of the transformation itself.
If, for example, the transformation is SENTENCE, the field will contain the whole sentence in which the token was found, but the beginning and end positions will be those of the original token, which can be a single word.
If you need to know the position of the whole extracted text (for example, to highlight the extraction in the document text), this information can be found in the lower levels of the detailed extraction output, at the "word" level.