Categorization rules syntax

The typical syntax of a categorization rule is:

DOMAIN[[ruleLabel]](domain[:scoreOption])
{
    condition
}

where:

  • DOMAIN is a language keyword and must be written in uppercase.
  • ruleLabel is a label that helps identify the rule.
  • domain is the name of one of the domains in the taxonomy.
  • scoreOption is the name of one of the available score options. It determines how many "points" are added to the overall domain score every time the rule is triggered. The default is NORMAL.
  • condition is the rule's condition.

Note

Parts between brackets ([...]) are optional.

Rules of this kind, when they activate, add points to the score of the domain specified in the header.

Every rule must be contained in a scope specifier:

SCOPE scopeOption
{
    DOMAIN[[ruleLabel]](domain[:scoreOption])
    {
        condition
    }
}

For example, the rule:

SCOPE SENTENCE
{
    DOMAIN(dom1:NORMAL)
    {
        LEMMA("abandon")
        AND
        LEMMA("oil well")
    }
}

is triggered when the lemmas abandon and oil well are both found in the same sentence. When triggered, it generates a NORMAL amount of points and adds them to the cumulative score of the dom1 domain.
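
As a small sketch of the default mentioned above: since the score option is optional and defaults to NORMAL, the same rule can presumably be written without it. The domain name dom1 is the same placeholder used in the example above.

SCOPE SENTENCE
{
    // No score option given: NORMAL is assumed by default
    DOMAIN(dom1)
    {
        LEMMA("abandon")
        AND
        LEMMA("oil well")
    }
}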

It is also possible to define categorization rules that affect the scores of multiple domains at once, using this syntax:

DOMAIN[[ruleLabel]](domain#1[:scoreOption, domain#2:scoreOption,...])
{
    condition
}

For example, the rule:

SCOPE SENTENCE
{
    DOMAIN(dom1:NORMAL, dom2:LOW, dom3:HIGH)
    {
        LEMMA("abandon")
        AND
        LEMMA("oil well")
    }
}

affects the score of the domain dom1 with the NORMAL score option, the domain dom2 with the LOW score option and the domain dom3 with the HIGH score option.

Multiple rules can be placed inside the same scope specifier:

SCOPE scopeOption
{
    //Rule #1
    DOMAIN[[ruleLabel]](domain[:scoreOption])
    {
        condition
    }

    //Rule #2
    DOMAIN[[ruleLabel]](domain[:scoreOption])
    {
        condition
    }

    ...
}
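
For illustration, a concrete version of this pattern might look like the following sketch. The domain names dom1 and dom2 and the lemma "drill" are placeholders chosen for the example, not names taken from a real taxonomy.

SCOPE SENTENCE
{
    // Rule #1: adds a NORMAL amount of points to dom1
    DOMAIN(dom1:NORMAL)
    {
        LEMMA("abandon")
        AND
        LEMMA("oil well")
    }

    // Rule #2: adds a HIGH amount of points to dom2
    DOMAIN(dom2:HIGH)
    {
        LEMMA("drill")
        AND
        LEMMA("oil well")
    }
}

Both rules share the sentence scope, and each one, when triggered, adds points only to the domain named in its own header.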