ROLE attribute

Basic syntax

The ROLE attribute identifies a token by the syntactic role it plays in a clause. A token is recognized in a text if it has been assigned one of the specified roles.

The syntax is:

ROLE(role1[, role2, ...])


  • ROLE is the attribute name and must be written in uppercase.
  • role# refers to one of the basic syntactic units commonly recognized during the logical analysis of sentences or clauses: subject, object, verb, etc.

During the disambiguation process, once a word class (noun, pronoun, adjective, adverb, etc.) has been assigned to every element in the clause or sentence, the words are analyzed from a different point of view. The disambiguator groups them and determines the syntactic function that each group performs, with the aim of assigning a role to each group of words.

A rule using the ROLE attribute is valid only if each role is written in its predefined format. All available roles are listed in the table below, with a brief description of each.

Role        Description
SUBJECT     Subject
OBJECT      Direct object
NOMINAL_P   Predicate nominal
VERBAL_P    Verbal predicate
INDIRECT    Indirect object
OTHER       Other complement


Please note: used alone, the ROLE attribute is hyper-generative, because it matches every group of words that has been assigned the specified role. It is highly recommended to use the ROLE attribute in conjunction with other attributes.

Please note that all the roles listed above except the last are commonly recognized categories in English, Italian, Spanish and German grammar. The role OTHER has a special status, as described in the next section.

The ROLE attribute accepts one or more roles in a given statement. A token is identified in a text if, during the disambiguation process, it has been associated with one of the specified roles.

For example:

ROLE(SUBJECT)

This statement would identify any element acting as a subject in a sentence or a clause.

For demonstration purposes, imagine that the statement above is used alone in a rule. In a sentence such as:

The President offered no specific proposals.

the text recognized as the subject is The President. The sentence consists of a single main clause in the active voice, and The President is the group of words acting as its subject.

The same statement defined above would also recognize subjects within passive voice clauses. Consider the following sentence:

All victims were wounded by officers.

The element recognized as a subject here is by officers. Grammatically it is the agent of the passive clause, but it is "promoted" to the role of real syntactic subject when the clause is turned into the active voice: All victims were wounded by officers can also be expressed as The officers wounded all the victims. The victims are not those who performed the action (normally the role attributed to the subject); instead, they "received" the action performed by the officers.
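
The matching behavior described above can be modeled with a short Python sketch. It is not the rules language itself and not the disambiguator: the Group class, the role_matches and select helpers, and the hand-written role annotations are all hypothetical, and SUBJECT is used as the name of the subject role. The roles given to All victims and were wounded are assumptions, since the text only states the role of by officers.

```python
from dataclasses import dataclass

# Role names as used on this page; all other names here are hypothetical.
ROLES = {"SUBJECT", "OBJECT", "NOMINAL_P", "VERBAL_P", "INDIRECT", "OTHER"}


@dataclass(frozen=True)
class Group:
    text: str  # the group of words
    role: str  # role assigned during disambiguation (hand-written here)


def role_matches(group: Group, *roles: str) -> bool:
    """Model of ROLE(role1[, role2, ...]): true if the group has any listed role."""
    unknown = set(roles) - ROLES
    if unknown:
        raise ValueError(f"unknown role(s): {sorted(unknown)}")
    return group.role in roles


def select(groups, *roles):
    """Return the text of every group matched by the modeled ROLE statement."""
    return [g.text for g in groups if role_matches(g, *roles)]


active = [
    Group("The President", "SUBJECT"),
    Group("offered", "VERBAL_P"),
    Group("no specific proposals", "OBJECT"),
]

passive = [
    Group("All victims", "OBJECT"),     # assumed role, not stated in the text
    Group("were wounded", "VERBAL_P"),  # assumed role, not stated in the text
    Group("by officers", "SUBJECT"),    # as described in the text
]

print(select(active, "SUBJECT"))
print(select(passive, "SUBJECT"))
print(select(active, "SUBJECT", "OBJECT"))
```

Listing several roles is treated here as an "any of" match, which follows from each group of words carrying a single role. The last call also shows why the attribute is hyper-generative on its own: every group carrying one of the listed roles is selected, whatever its content.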

OTHER role

When using the ROLE attribute, it is possible to specify not only the "standard" roles corresponding to recognized categories in English, Italian, Spanish and German grammar, but also the role named OTHER.

During the disambiguation process, if the syntactic roles of all elements in a sentence cannot be interpreted accurately, or none of the available roles is suitable, the role OTHER is assigned by default. Thus, it is possible to use the logical structure of the text to develop both categorization and extraction rules.
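
The default assignment can be pictured with the following Python sketch. It only illustrates the fallback idea under stated assumptions: final_role is a hypothetical helper, None stands for "no role could be determined", and the input groups are invented for the example rather than produced by the disambiguator.

```python
from typing import Optional


def final_role(resolved: Optional[str]) -> str:
    """Return the resolved role, or OTHER when no role could be determined."""
    return resolved if resolved is not None else "OTHER"


# Hypothetical analysis result: (group of words, resolved role or None).
groups = [
    ("The President", "SUBJECT"),
    ("offered", "VERBAL_P"),
    ("yesterday", None),  # invented case where no standard role was resolved
]

labelled = [(text, final_role(role)) for text, role in groups]

# A statement such as ROLE(OTHER) would then pick up the fallback groups.
others = [text for text, role in labelled if role == "OTHER"]
print(others)
```

Because every group ends up with some role, a rule can still refer to the groups that the standard roles do not cover.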