SCOPE with SENTENCE RELEVANT syntax
Introduction
During the disambiguation process, the disambiguator is able to identify the most relevant elements in a document: keywords, lemmas, syncons, domains, and sentences. This information can be used in a variety of ways to enhance the accuracy of categorization and extraction rules.
In particular, the three most relevant sentences identified in each document can become a valid option or constraint for the SCOPE definition. In fact, the SENTENCE RELEVANT syntax can be used either alone as a proper SCOPE option, or in combination with all types of standard and custom scope options.
SENTENCE RELEVANT scope option
The syntax for using SENTENCE RELEVANT as a standalone SCOPE option is the following:
SCOPE SENTENCE RELEVANT
{
rule(s)
}
Any rule with such a scope definition will be activated only if the element(s) specified in the rule itself are found within at least one of the three relevant sentences of a document.
The relevant sentences identified in a document are also assigned a percentage, which indicates their relevance compared to the rest of the sentences in the document. This value can also be used to further restrict the SCOPE definition:
SCOPE SENTENCE RELEVANT:threshold
{
rule(s)
}
where threshold corresponds to either an integer or a decimal indicating the percent threshold to be considered. For example, using the following scope definition:
SCOPE SENTENCE RELEVANT:10%
{
rule(s)
}
the rule is activated only when the element(s) specified in the rule itself are found within one of the three relevant sentences of a document, and that sentence has a relevance value of at least 10%.
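The selection behavior described above can be modeled outside the rules language. The following Python sketch is an illustration only, under stated assumptions: the `Sentence` type, the `relevant_sentences` function, and the sample relevance values are hypothetical and not part of the product. It assumes that each sentence carries one relevance percentage and that the three highest-scoring sentences are the relevant ones.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sentence:
    index: int        # position of the sentence in the document
    relevance: float  # relevance percentage (made-up sample values below)


def relevant_sentences(sentences, threshold=0.0, top_n=3):
    """Keep the top_n sentences by relevance, then drop those below threshold.

    Hypothetical model of SCOPE SENTENCE RELEVANT[:threshold]; the real
    engine may rank or break ties differently.
    """
    ranked = sorted(sentences, key=lambda s: s.relevance, reverse=True)[:top_n]
    return [s for s in ranked if s.relevance >= threshold]


doc = [
    Sentence(0, 4.0),
    Sentence(1, 22.5),
    Sentence(2, 9.0),
    Sentence(3, 12.0),
    Sentence(4, 1.5),
]

# Models SCOPE SENTENCE RELEVANT
no_threshold = relevant_sentences(doc)
# Models SCOPE SENTENCE RELEVANT:10%
with_threshold = relevant_sentences(doc, threshold=10)
```

With these sample values, the unrestricted scope would cover sentences 1, 3 and 2, while the 10% threshold would narrow it to sentences 1 and 3.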
SENTENCE RELEVANT scope constraint
The SENTENCE RELEVANT syntax can also be used in combination with other scope options to further restrict the scope itself. Two cases are possible:
- The scope option defines a portion of text which is smaller than a sentence; therefore the relevant sentence contains the selected scope, which can be either PHRASE or CLAUSE.
- The scope option defines a portion of text which is larger than a sentence; therefore the relevant sentence is contained in the selected scope, which can be PARAGRAPH, SEGMENT or SECTION.
Each of the two cases uses the SENTENCE RELEVANT syntax in a different way. When the scope is smaller, the syntax is implemented as follows:
SCOPE scopeOption IN SENTENCE RELEVANT
{
rule(s)
}
When the scope is larger, the syntax is implemented as follows:
SCOPE scopeOption WITH SENTENCE RELEVANT
{
rule(s)
}
where scopeOption refers to the options mentioned above for each case.
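The difference between IN and WITH comes down to which text span contains the other. The Python sketch below models that relationship with character offsets; the `Span` type, the helper functions, and the offsets are hypothetical and chosen only for illustration, and the actual engine may represent scopes differently.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def contains(outer, inner):
    """True when inner lies entirely within outer."""
    return outer.start <= inner.start and inner.end <= outer.end


def scope_in_relevant(scope, relevant):
    """Models 'scopeOption IN SENTENCE RELEVANT': a smaller scope
    (PHRASE, CLAUSE) sits inside one of the relevant sentences."""
    return any(contains(sentence, scope) for sentence in relevant)


def scope_with_relevant(scope, relevant):
    """Models 'scopeOption WITH SENTENCE RELEVANT': a larger scope
    (PARAGRAPH, SEGMENT, SECTION) includes one of the relevant sentences."""
    return any(contains(scope, sentence) for sentence in relevant)


# Made-up offsets for two relevant sentences, a phrase and a paragraph.
relevant = [Span(40, 95), Span(200, 260)]
phrase = Span(50, 62)
paragraph = Span(0, 120)

phrase_ok = scope_in_relevant(phrase, relevant)
paragraph_ok = scope_with_relevant(paragraph, relevant)
```

In this model the phrase qualifies because it falls inside the first relevant sentence, and the paragraph qualifies because it fully includes that same sentence.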
As with the standalone scope option, the relevance percentage assigned to each relevant sentence can be used to further restrict the SCOPE definition.
The syntax is the following:
SCOPE scopeOption IN SENTENCE RELEVANT:threshold
{
rule(s)
}
SCOPE scopeOption WITH SENTENCE RELEVANT:threshold
{
rule(s)
}
where threshold corresponds to either an integer or a decimal indicating the percent threshold to be considered.
A few examples for the different cases are listed below:
- Smaller scope option:
  SCOPE PHRASE IN SENTENCE RELEVANT
  SCOPE CLAUSE IN SENTENCE RELEVANT:17%
  SCOPE PHRASE (NP) IN SENTENCE RELEVANT:10%
  SCOPE CLAUSE (INDEPENDENT) IN SENTENCE RELEVANT
  SCOPE PHRASE IN CLAUSE(SUBORDINATE) IN SENTENCE RELEVANT:22%
  SCOPE PHRASE (NP, NP/PP) IN CLAUSE (INDEPENDENT) IN SENTENCE RELEVANT:10%
- Larger scope option:
  SCOPE PARAGRAPH WITH SENTENCE RELEVANT:2%
  SCOPE SECTION(TITLE, BODY) WITH SENTENCE RELEVANT
  SCOPE SEGMENT(SENDER) WITH SENTENCE RELEVANT:10%