SCOPE standard options
Introduction
The standard scope options correspond to the textual subdivisions generated during the semantic disambiguation process. They are used to delimit the area of action of a rule or group of rules. There are four standard options, listed below in order of extent (from the widest to the narrowest):
PARAGRAPH
SENTENCE
CLAUSE
PHRASE
PARAGRAPH scope
PARAGRAPH is the broadest among the standard textual subdivisions. It is a unit of discourse consisting of one or more sentences dealing with a particular concept. Its start is typically indicated by the beginning of a new line. By selecting this option, a rule will be applied to every paragraph that has been recognized in the entire input document.
The syntax for the scope option PARAGRAPH is the following:
SCOPE PARAGRAPH [ON ATOM]
{
rule(s)
}
Note
Parts between square brackets ([]) are optional.
ON ATOM is optional and lets your rules trigger based on an atom-based count of the textual elements of the sentence. You can find a practical example in the positional sequences section of this manual.
It is also possible to operate on multiple paragraphs using the following syntax:
SCOPE PARAGRAPH*n [ON ATOM]
{
rule(s)
}
Where the asterisk (*) is a multiplier and n is a number indicating how many adjoining paragraphs the rule will act upon. For example:
SCOPE PARAGRAPH*2
{
//rule(s)//
}
Here the rule will act upon text blocks, each made of two paragraphs: the first and the second paragraph of the document, the second and the third, the third and the fourth, and so on.
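The grouping produced by the multiplier can be pictured as a sliding window that advances one unit at a time. The following Python sketch is only an illustration of that idea, not how the engine is implemented: the function name `adjoining_blocks` and the sample labels are hypothetical, and the behavior for a text with fewer than n units is an assumption, since it is not described here.

```python
from typing import List, Sequence, Tuple


def adjoining_blocks(units: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return every run of n adjoining units, advancing one unit at a time."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Assumption: a text with fewer than n units yields no block at all.
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]


# Four paragraphs grouped the way SCOPE PARAGRAPH*2 is described above.
paragraphs = ["P1", "P2", "P3", "P4"]
for block in adjoining_blocks(paragraphs, 2):
    print(block)
```

The same function applies unchanged to sentences: calling it with n set to 3 mirrors the SENTENCE*3 case.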
SENTENCE scope
SENTENCE is one of the standard textual subdivisions. It consists of one or several words, linked to each other by a syntactic relation, which are able to convey meaning. The beginning of a sentence usually follows a punctuation mark such as a period, question mark or exclamation mark. By selecting this option, a rule will act upon each sentence that has been recognized in the whole input document.
The syntax for the scope option SENTENCE is the following:
SCOPE SENTENCE [ON ATOM]
{
rule(s)
}
It is also possible to operate on multiple sentences by using the following syntax:
SCOPE SENTENCE*n [ON ATOM]
{
rule(s)
}
Where the asterisk (*) is a multiplier and n is a number indicating how many adjoining sentences the rule will act upon. For example:
SCOPE SENTENCE*3
{
//rule(s)//
}
Here the rule will act upon text blocks made of three sentences: the first, the second and the third sentence of the text; the second, the third and the fourth; and so on.
CLAUSE scope
CLAUSE is one of the standard textual subdivisions. It consists of one or several words within a sentence representing the smallest grammatical unit that can express a complete proposition. By selecting this option, the rule will be applied to every clause that has been recognized in the entire input document.
The syntax for the scope option CLAUSE is the following:
SCOPE CLAUSE [ON ATOM]
{
rule(s)
}
The disambiguator is able to recognize the clauses in a sentence (if they exist), identify the types of clauses (independent or dependent) and also the types of dependent clauses.
The predefined names of such clause types can be optionally used as constraints to define a clause scope. The syntax is:
SCOPE CLAUSE(clauseType) [ON ATOM]
{
rule(s)
}
The clause types are listed below. Because GENERIC identifies a portion of text which is not a proper clause, the SCOPE CLAUSE syntax with clauseType GENERIC does not trigger any rule.
Type | Abbreviation | Description |
---|---|---|
INDEPENDENT | IND | Corresponds to the only clause of a simple sentence or to the main clause of a complex sentence containing several clauses |
SUBORDINATE | SUB | Corresponds to any kind of dependent clause that adds information to an independent clause, but which cannot stand by itself as a sentence. For example, all the clause types listed here except INDEPENDENT |
RELATIVE | REL | Corresponds to a kind of subordinate clause that begins with a relative pronoun and contains an element whose interpretation is provided by an antecedent on which the subordinate clause is grammatically dependent |
NON-FINITE | NF | Corresponds to a kind of subordinate clause containing a verb which does not show tense (non-finite) |
PREPOSITIONAL | PRP | Corresponds to a kind of subordinate clause introduced by a preposition |
CAUSAL | CSL | Corresponds to a kind of subordinate clause that states the reason or cause of the fact stated in the independent clause |
TEMPORAL | TMP | Corresponds to a kind of subordinate clause that indicates an act or state that occurs prior to, at the same time as, or subsequent to the act or state of the main clause |
COMPARATIVE | CMP | Corresponds to a kind of subordinate clause that follows the comparative form of an adjective or adverb |
SUBJECT /OBJECT | S/O | Corresponds to a kind of subordinate noun clause which acts as the subject or the object of the preposition |
NON-FINITE SUBJECT /OBJECT | NSO | Corresponds to a kind of subordinate noun clause containing a verb which does not show tense (non-finite) |
CONDITIONAL SUBJECT /OBJECT | CSO | Corresponds to a kind of subordinate noun clause expressing factual implications or hypothetical situations and their consequences (introduced by if) |
PURPOSE | PRS | Corresponds to a kind of adverbial clause expressing purpose (commonly referred to in linguistics as final clause) |
CONSECUTIVE | CNS | Corresponds to a kind of subordinate clause which expresses the result of the action stated in the main clause or a preceding sentence |
CONCESSIVE | CNC | Corresponds to a kind of subordinate clause which expresses an opposite idea compared to the main clause |
ADVERSATIVE | ADV | Corresponds to a kind of subordinate clause which expresses an event or situation that is opposite to that of the main clause (introduced by but...) |
GENERIC [1] | SUB | Denotes portions of text that are not real clauses |
PARENTHETIC | INC | Corresponds to a kind of clause, often explanatory or qualifying, inserted into a passage with which it is not grammatically connected, and marked off by brackets, dashes, etc. |
It is possible to select a single clause type or a list of several types to be used as a constraint. For example:
SCOPE CLAUSE(INDEPENDENT)
{
//rule(s)//
}
is a rule which acts upon any text block recognized to be an independent clause within a sentence, whereas
SCOPE CLAUSE(TEMPORAL, COMPARATIVE)
{
//rule(s)//
}
acts upon any text block recognized to be either a temporal or a comparative clause.
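The effect of a clause type list can be read as a simple membership test on the type assigned to each clause. The Python sketch below illustrates that reading under stated assumptions: the `(clause type, text)` pairs and the function `clauses_in_scope` are hypothetical stand-ins, not real disambiguator output or an actual API.

```python
from typing import List, Sequence, Tuple

# (clause type, clause text) pairs; the labels are invented for illustration
# and are not real disambiguator output.
LabelledClause = Tuple[str, str]


def clauses_in_scope(clauses: Sequence[LabelledClause],
                     clause_types: Sequence[str]) -> List[str]:
    """Keep the text of clauses whose type is one of clause_types."""
    wanted = set(clause_types)
    # GENERIC never triggers a rule, so it is dropped even if requested.
    wanted.discard("GENERIC")
    return [text for ctype, text in clauses if ctype in wanted]


clauses = [
    ("INDEPENDENT", "She left"),
    ("TEMPORAL", "before the meeting ended"),
    ("GENERIC", "n/a"),
]
# Mirrors SCOPE CLAUSE(TEMPORAL, COMPARATIVE): only the temporal clause is kept.
print(clauses_in_scope(clauses, ["TEMPORAL", "COMPARATIVE"]))
```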
See also the combination of PHRASE and CLAUSE scope options.
PHRASE scope
PHRASE is the narrowest among the standard textual subdivisions. It consists of one or several words that function as a constituent of the sentence and act as single units in its syntax. Common examples of "phrases" are noun phrases, verb phrases, etc. By selecting this option, a rule will act upon every phrase that has been recognized in the whole input document.
The syntax for the scope option PHRASE is the following:
SCOPE PHRASE [ON ATOM]
{
rule(s)
}
The disambiguator is able to recognize the phrases in a sentence and identify which types they are. The predefined names of such PHRASE types can be optionally used as constraints to define a PHRASE scope. The syntax is:
SCOPE PHRASE(phraseType) [ON ATOM]
{
rule(s)
}
The phrase types are:
Phrase Type | Phrase Type (Italian only) | Description |
---|---|---|
AP | GA | Adjective Phrase |
CP | CN | Conjunction Phrase |
DP | GV | Adverb Phrase |
NA | NA | Not Applicable (usually indicates punctuation) |
NP | GN | Noun Phrase |
PN | PN | Nominal Predicate |
PP | GP | Preposition Phrase |
RP | GR | Relative Phrase |
VP | PV | Verb Phrase |
It is possible to select a single PHRASE type or a list of them to be used as a constraint. For example:
SCOPE PHRASE(NP)
{
//rule(s)//
}
This rule acts upon any text block recognized to be a noun phrase, whereas this one:
SCOPE PHRASE(PP, NP)
{
//rule(s)//
}
will act upon any text block recognized to be either a prepositional phrase or a noun phrase.
Finally, PHRASE types can be used in sequences of two or more elements, each separated by a slash sign (/). For example:
SCOPE PHRASE(NP/VP)
{
//rule(s)//
}
will act upon any text block recognized to be a noun phrase only if followed by a verb phrase.
Note
In this case the rule will only act upon the NP phrase, because the VP phrase is a further scope restriction on the rule.
More complex combinations such as:
SCOPE PHRASE(AP, PP, NP/VP)
{
//rule(s)//
}
are also valid. Such a scope definition will act upon any text block recognized to be an adjective phrase or a prepositional phrase or a noun phrase followed by a verb phrase.
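One way to reason about a combined definition is to split it at the commas into alternatives and at the slashes into sequences. The Python sketch below follows that reading; it is an illustration, not the engine's parser. The function names are hypothetical, and two points are assumptions: that "followed by" means immediately followed, and that in sequences longer than two elements only the first phrase is acted upon.

```python
from typing import List, Sequence


def parse_phrase_scope(spec: str) -> List[List[str]]:
    """Split 'AP, PP, NP/VP' into alternatives, each a sequence of phrase types."""
    return [[t.strip() for t in alt.split("/")] for alt in spec.split(",")]


def phrases_in_scope(phrase_types: Sequence[str], spec: str) -> List[int]:
    """Return the indices of the phrases the scope would act upon."""
    alternatives = parse_phrase_scope(spec)
    hits = []
    for i in range(len(phrase_types)):
        for seq in alternatives:
            # The first type selects the phrase; any further types only
            # constrain the phrases that immediately follow it (assumption).
            if list(phrase_types[i:i + len(seq)]) == seq:
                hits.append(i)
                break
    return hits


# Phrase types of a hypothetical sentence, in order of appearance.
sentence = ["NP", "VP", "PP", "NP"]
print(phrases_in_scope(sentence, "AP, PP, NP/VP"))  # [0, 2]
```

In this sample the first NP is selected because a VP follows it, the PP is selected on its own, and the final NP is skipped because no VP follows it.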
Combination of PHRASE and CLAUSE scope options
PHRASE and CLAUSE options can be combined when setting a rule's scope. In particular, it is possible to select a phrase included within a specific clause scope. In other words, with this option a rule acts only upon a phrase that has been recognized within a given clause.
The syntax for combining the PHRASE and CLAUSE scope options is the following:
SCOPE PHRASE IN CLAUSE(clauseType)
{
rule(s)
}
where clauseType corresponds to one of the types available for the CLAUSE option.
It is also possible to select one or more of the available phrase types to further restrict the scope definition.
SCOPE PHRASE(phraseType) IN CLAUSE(clauseType)
{
rule(s)
}
For example:
SCOPE PHRASE(NP) IN CLAUSE(INDEPENDENT)
{
//rule(s)//
}
Here the rule will act upon any text block recognized to be a noun phrase within an independent clause.
The use of such combinations aims to identify a very precise and limited area for the rule to act upon; in fact, the hits generated by the rules with this kind of scope are more likely to be characterized by high precision rather than high recall.
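The narrowing effect of the combined scope can be sketched as two nested filters: first on the clause type, then on the phrase type. The Python example below is a minimal illustration under stated assumptions; the nested `(clause type, phrases)` structure and the function `phrases_in_clause` are hypothetical and do not reflect the disambiguator's real output format.

```python
from typing import List, Optional, Sequence, Tuple

# Each clause is (clause type, [(phrase type, phrase text), ...]).
# This layout is invented for the example.
Phrase = Tuple[str, str]
Clause = Tuple[str, List[Phrase]]


def phrases_in_clause(clauses: Sequence[Clause],
                      clause_type: str,
                      phrase_types: Optional[Sequence[str]] = None) -> List[str]:
    """Collect phrase texts found inside clauses of the given type."""
    hits = []
    for ctype, phrases in clauses:
        if ctype != clause_type:
            continue
        for ptype, text in phrases:
            # No phrase-type list means every phrase in the clause qualifies.
            if phrase_types is None or ptype in phrase_types:
                hits.append(text)
    return hits


parsed = [
    ("INDEPENDENT", [("NP", "The board"), ("VP", "approved"), ("NP", "the plan")]),
    ("TEMPORAL", [("CP", "after"), ("NP", "the vote"), ("VP", "closed")]),
]
# Mirrors SCOPE PHRASE(NP) IN CLAUSE(INDEPENDENT).
print(phrases_in_clause(parsed, "INDEPENDENT", ["NP"]))
```

Of the three noun phrases in the sample, only the two inside the independent clause are kept, which shows why this kind of scope tends to favor precision over recall.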
[1] As GENERIC identifies a portion of text which is not properly a clause, the SCOPE CLAUSE syntax with clauseType GENERIC does not trigger any rule.