Attributes overview

Attributes are the building blocks of categorization and extraction rules.

They are operands that match tokens in the disambiguation output based on the properties those tokens possess.

The syntax of a generic attribute is:

attributeName(value1[, value2, ...])

where:

attributeName can be one of the possible attributes listed below.
value# refers to a value taken by the attribute as a parameter. Every attribute accepts one or more values.
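As an illustration of the generic syntax above, here is a minimal Python sketch that splits an attribute expression into its name and its list of values. It is not the product's parser. The function name `parse_attribute`, the double-quoted form of string values, and the numeric form of the syncon values in the usage lines are all assumptions made for the example; only the attribute names come from the table in this section.

```python
import re

# Attribute names taken from the table in this section.
KNOWN_ATTRIBUTES = {
    "KEYWORD", "LEMMA", "BLEMMA", "ULEMMA", "SYNCON", "ANCESTOR", "LIST",
    "BLIST", "TYPE", "PATTERN", "ROLE", "POSITION", "RELEVANT", "TAG",
    "BTAG", "CELL", "TITLELEVEL", "STEM",
}

# attributeName(...): a name followed by a parenthesized value list.
_ATTRIBUTE_RE = re.compile(r"^\s*([A-Za-z]+)\s*\((.*)\)\s*$", re.DOTALL)

# One value: a double-quoted string (assumed quoting style) or a run of
# characters without commas, followed by a comma or the end of the list.
_VALUE_RE = re.compile(r'\s*("(?:[^"\\]|\\.)*"|[^,]+?)\s*(?:,|$)')


def parse_attribute(text):
    """Hypothetical helper: return (name, [raw values]) for NAME(v1, v2, ...)."""
    match = _ATTRIBUTE_RE.match(text)
    if match is None:
        raise ValueError(f"not an attribute expression: {text!r}")
    name, body = match.group(1), match.group(2).strip()
    if name not in KNOWN_ATTRIBUTES:
        raise ValueError(f"unknown attribute: {name}")

    values = []
    pos = 0
    while pos < len(body):
        value_match = _VALUE_RE.match(body, pos)
        if value_match is None:
            raise ValueError(f"malformed value list: {body!r}")
        values.append(value_match.group(1))
        pos = value_match.end()

    if not values:
        raise ValueError(f"{name} needs at least one value")
    return name, values


# Usage (the value formats below are assumptions, see the lead-in):
name, values = parse_attribute('KEYWORD("dog", "cat")')
print(name, values)
print(parse_attribute("ANCESTOR(12345, 678)"))
```

The sketch keeps each value in its raw form (quotes included), because how a value is interpreted depends on the attribute it is passed to.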

This table lists all possible attributes along with their values and a short description. For further details, please see the individual attribute sections.

Attribute	Value type	Description
`KEYWORD`	String	Matches any token that is exactly equal to the given strings
`LEMMA`	Lemma	Matches any token that is a possible inflection of a given lemma contained in the knowledge graph
`BLEMMA`	Lemma	Similar to `LEMMA`, but a match is performed at the sub-token or "atom" level of the text
`ULEMMA`	Lemma	Matches any token that is a possible inflection of a given lemma not contained in the knowledge graph
`SYNCON`	Syncon	Matches any token which corresponds to a given concept (syncon) contained in the knowledge graph
`ANCESTOR`	Syncon	Matches every token which corresponds to a "descendant concept" of a given concept contained in the knowledge graph
`LIST`	Syncon	Matches every token corresponding to any lemma of the given knowledge graph concepts
`BLIST`	Syncon	Similar to `LIST`, but matches are performed at the "atom level" of the text analysis output
`TYPE`	Type	Matches any token of the given types
`PATTERN`	Regular expression(s)	Matches any token matching the given regular expressions
`ROLE`	Role	Matches any token having one of the given roles in the sentence analysis of the text
`POSITION`	Position	Matches any token occupying one of the given positions
`RELEVANT`	List	Matches any token that belongs to one of the given lists of relevant text elements
`TAG`	Tag	Matches any token corresponding to one of the given tags
`BTAG`	Tag	Similar to `TAG`, but a match is performed at the sub-token or "atom" level of the text
`CELL`	Integer	Queries and extracts the content of table cells at the given coordinates, defined by row and column
`TITLELEVEL`	Integer	Extracts content related to a given heading level
`STEM`	String	Matches any token sharing the root of the given strings

Optionally, values can be loaded from an external list.