Attributes overview
Attributes are the building blocks of categorization and extraction rules.
They are operands that match tokens in the disambiguation output according to the properties those tokens have.
The syntax of a generic attribute is:
attributeName(value1[, value2, ...])
where:
- attributeName is one of the attributes listed below.
- value# refers to a parameter taken by the attribute. For each attribute, it is possible to specify more than one value.
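The generic syntax can be illustrated with a short Python sketch that splits an attribute expression into its name and its list of values. This is not part of the rules language or of any official tooling: the function name `parse_attribute` is hypothetical, and the sketch assumes that string values are written in double quotes and contain no escaped quotes.

```python
import re

# Hypothetical helper, for illustration only.
# Shape handled: attributeName(value1[, value2, ...])
_ATTRIBUTE_RE = re.compile(r"^\s*([A-Za-z_]\w*)\s*\((.*)\)\s*$", re.DOTALL)


def parse_attribute(expression):
    """Return (name, values) for text shaped like attributeName(value1, value2)."""
    match = _ATTRIBUTE_RE.match(expression)
    if match is None:
        raise ValueError(f"not an attribute expression: {expression!r}")
    name, body = match.groups()

    values = []
    current = []
    in_quotes = False
    for char in body:
        if char == '"':
            # Assumption: double quotes delimit strings and are never escaped.
            in_quotes = not in_quotes
            current.append(char)
        elif char == "," and not in_quotes:
            # A comma outside quotes separates two values.
            values.append("".join(current).strip())
            current = []
        else:
            current.append(char)

    tail = "".join(current).strip()
    if tail:
        values.append(tail)
    if not values:
        raise ValueError("an attribute needs at least one value")
    return name, values


print(parse_attribute('KEYWORD("dog", "cat")'))
```

The values are returned as raw text, quotes included, because the way each value is interpreted depends on the attribute.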
This table lists all possible attributes along with their values and a short description. For further details, please see the individual attribute sections.
Attribute | Value type | Description |
---|---|---|
KEYWORD | String | Matches any token that is exactly equal to one of the given strings |
LEMMA | Lemma | Matches any token that is a possible inflection of a given lemma contained in the knowledge graph |
BLEMMA | Lemma | Similar to LEMMA, but the match is performed at the sub-token or "atom" level of the text |
ULEMMA | Lemma | Matches any token that is a possible inflection of a given lemma not contained in the knowledge graph |
SYNCON | Syncon | Matches any token that corresponds to a given concept (syncon) contained in the knowledge graph |
ANCESTOR | Syncon | Matches every token that corresponds to a "descendant concept" of a given concept contained in the knowledge graph |
LIST | Syncon | Matches every token corresponding to any lemma of the given knowledge graph concepts |
BLIST | Syncon | Similar to LIST, but the match is performed at the sub-token or "atom" level of the text |
TYPE | Type | Matches any token of the given types |
PATTERN | Regular expression(s) | Matches any token matching the given regular expressions |
ROLE | Role | Matches any token having one of the given roles in the sentence analysis of the text |
POSITION | Position | Matches any token occupying one of the given positions |
RELEVANT | List | Matches any token that is in one of the given lists of relevant text elements |
TAG | Tag | Matches any token corresponding to one of the given tags |
BTAG | Tag | Similar to TAG, but the match is performed at the sub-token or "atom" level of the text |
CELL | Integer | Queries and extracts the content of table cells at the given coordinates, defined by row and column |
TITLELEVEL | Integer | Extracts content related to a given heading level |
STEM | String | Matches any token sharing the root of the given strings |
As an option, values can be loaded from an external list.
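To make the idea of an attribute as a matching operand more concrete, here is a toy Python model of two attributes from the table, KEYWORD and PATTERN. It is a simplification under stated assumptions, not how the product evaluates rules: tokens are plain strings rather than disambiguation output, `keyword`, `pattern` and `matching_tokens` are hypothetical names, comparison is case-sensitive, and a regular expression is assumed to need to match the whole token (the source does not specify this).

```python
import re


def keyword(*strings):
    """Toy KEYWORD: true when the token is exactly equal to one of the strings."""
    allowed = set(strings)
    return lambda token: token in allowed


def pattern(*regexes):
    """Toy PATTERN: true when the whole token matches one of the regexes.

    Assumption: whole-token matching; the real attribute may behave differently.
    """
    compiled = [re.compile(regex) for regex in regexes]
    return lambda token: any(c.fullmatch(token) for c in compiled)


def matching_tokens(tokens, attribute):
    """Return the tokens accepted by the given attribute predicate."""
    return [token for token in tokens if attribute(token)]


tokens = ["The", "invoice", "AB123", "was", "paid"]
print(matching_tokens(tokens, keyword("invoice", "paid")))  # ['invoice', 'paid']
print(matching_tokens(tokens, pattern(r"[A-Z]{2}\d{3}")))   # ['AB123']
```

In both cases several values behave as alternatives: a token matches when it satisfies any one of them.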