SCOPE standard options

Introduction

The standard scope options correspond to the textual subdivisions generated during the semantic disambiguation process. They delimit the portion of text in which a rule or group of rules acts. There are four standard options, listed below in order of extent (from the widest to the narrowest):

  • PARAGRAPH
  • SENTENCE
  • CLAUSE
  • PHRASE

PARAGRAPH scope

PARAGRAPH is the broadest of the standard textual subdivisions. It is a unit of discourse consisting of one or more sentences dealing with a particular concept. Its start is typically indicated by the beginning of a new line. When this option is selected, a rule is applied to every paragraph recognized in the input document.

The syntax for the scope option PARAGRAPH is the following:

SCOPE PARAGRAPH [ON ATOM]
{
    rule(s)
}

Note

Parts between square brackets ([]) are optional.

ON ATOM is optional and lets your rules trigger based on an atom-based count of the textual elements of the sentence. You can find a practical example in the positional sequences section of this manual.

It is also possible to operate on multiple paragraphs using the following syntax:

SCOPE PARAGRAPH*n [ON ATOM]
{
    rule(s)
}

Where the asterisk (*) is a multiplier and n is a number indicating how many adjoining paragraphs the rule will act upon. For example:

SCOPE PARAGRAPH*2
{
    //rule(s)//
}

Here the rule acts upon text blocks of two adjoining paragraphs each: the first and the second paragraph of the document, the second and the third, the third and the fourth, and so on.
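
The multiplier behaves like a sliding window that advances one unit at a time. The following Python sketch only models that behavior; it is not part of the rules language, and the function name `scope_windows` is hypothetical. It also assumes that a document with fewer than n units yields no text block, which this manual does not state.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def scope_windows(units: Sequence[T], n: int = 1) -> List[List[T]]:
    """Return every run of n adjoining units, advancing one unit at a time."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Assumption: fewer than n units produces no window at all.
    return [list(units[i:i + n]) for i in range(len(units) - n + 1)]


# Models SCOPE PARAGRAPH*2 on a document with four paragraphs.
paragraphs = ["P1", "P2", "P3", "P4"]
windows = scope_windows(paragraphs, 2)
print(windows)  # [['P1', 'P2'], ['P2', 'P3'], ['P3', 'P4']]
```

Each paragraph except the first and the last appears in two windows, so a rule can match the same text in more than one block.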

SENTENCE scope

SENTENCE is one of the standard textual subdivisions. It consists of one or several words, linked to each other by a syntactic relation, which are able to convey meaning. The beginning of a sentence usually follows a punctuation mark such as a period, question mark or exclamation mark. By selecting this option, a rule will act upon each sentence that has been recognized in the whole input document.

The syntax for the scope option SENTENCE is the following:

SCOPE SENTENCE [ON ATOM]
{
    rule(s)
}

It is also possible to operate on multiple sentences by using the following syntax:

SCOPE SENTENCE*n [ON ATOM]
{
    rule(s)
}

Where the asterisk (*) is a multiplier and n is a number indicating how many adjoining sentences the rule will act upon. For example:

SCOPE SENTENCE*3
{
    //rule(s)//
}

Here the rule acts upon text blocks made of three adjoining sentences: the first, second and third sentences of the text; the second, third and fourth; and so on.

CLAUSE scope

CLAUSE is one of the standard textual subdivisions. It consists of one or several words within a sentence representing the smallest grammatical unit that can express a complete proposition. By selecting this option, the rule will be applied to every clause that has been recognized in the entire input document.

The syntax for the scope option CLAUSE is the following:

SCOPE CLAUSE [ON ATOM]
{
    rule(s)
}

The disambiguator is able to recognize the clauses in a sentence (if they exist), identify the types of clauses (independent or dependent) and also the types of dependent clauses.

The predefined names of such clause types can be optionally used as constraints to define a clause scope. The syntax is:

SCOPE CLAUSE(clauseType) [ON ATOM]
{
    rule(s)
}

The clause types are listed below. Because GENERIC identifies a portion of text that is not a proper clause, the SCOPE CLAUSE syntax with clauseType GENERIC does not trigger any rule.

| Type | Abbreviation | Description |
|---|---|---|
| INDEPENDENT | IND | Corresponds to the only clause of a simple sentence or to the main clause of a complex sentence containing several clauses |
| SUBORDINATE | SUB | Corresponds to any kind of dependent clause that adds information to an independent clause, but which cannot stand by itself as a sentence. For example, all the clause types listed here except INDEPENDENT |
| RELATIVE | REL | Corresponds to a kind of subordinate clause that begins with a relative pronoun and contains an element whose interpretation is provided by an antecedent on which the subordinate clause is grammatically dependent |
| NON-FINITE | NF | Corresponds to a kind of subordinate clause containing a verb which does not show tense (non-finite) |
| PREPOSITIONAL | PRP | Corresponds to a kind of subordinate clause introduced by a preposition |
| CAUSAL | CSL | Corresponds to a kind of subordinate clause that states the reason or cause of the fact stated in the independent clause |
| TEMPORAL | TMP | Corresponds to a kind of subordinate clause that indicates an act or state that occurs prior to, at the same time as, or subsequent to the act or state of the main clause |
| COMPARATIVE | CMP | Corresponds to a kind of subordinate clause that follows the comparative form of an adjective or adverb |
| SUBJECT/OBJECT | S/O | Corresponds to a kind of subordinate noun clause which acts as the subject or the object of the preposition |
| NON-FINITE SUBJECT/OBJECT | NSO | Corresponds to a kind of subordinate noun clause containing a verb which does not show tense (non-finite) |
| CONDITIONAL SUBJECT/OBJECT | CSO | Corresponds to a kind of subordinate noun clause expressing factual implications or hypothetical situations and their consequences (introduced by if) |
| PURPOSE | PRS | Corresponds to a kind of adverbial clause expressing purpose (commonly referred to in linguistics as final clause) |
| CONSECUTIVE | CNS | Corresponds to a kind of subordinate clause which expresses the result of the action stated in the main clause or a preceding sentence |
| CONCESSIVE | CNC | Corresponds to a kind of subordinate clause which expresses an opposite idea compared to the main clause |
| ADVERSATIVE | ADV | Corresponds to a kind of subordinate clause which expresses an event or situation that is opposite to that of the main clause (introduced by but...) |
| GENERIC [1] | SUB | Denotes portions of text that are not real clauses |
| PARENTHETIC | INC | Corresponds to a kind of clause, often explanatory or qualifying, inserted into a passage with which it is not grammatically connected, and marked off by brackets, dashes, etc. |

It is possible to select a single clause type or a list of several types to be used as a constraint. For example:

SCOPE CLAUSE(INDEPENDENT)
{
    //rule(s)//
}

defines a scope that acts upon any text block recognized as an independent clause within a sentence, whereas

SCOPE CLAUSE(TEMPORAL, COMPARATIVE)
{
    //rule(s)//
}

acts upon any text block recognized to be either a temporal or a comparative clause.
See also the combination of PHRASE and CLAUSE scope options.
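
The effect of a clause type constraint can be pictured as a filter over the clauses recognized by the disambiguator. The Python sketch below is only a model of that selection, not the rules language itself: the function name `clauses_in_scope` is hypothetical, and the sample clauses are labelled by hand rather than produced by the disambiguator. How an unconstrained SCOPE CLAUSE treats GENERIC portions is not stated in this section, so the sketch simply returns everything in that case.

```python
from typing import List, Sequence, Tuple

Clause = Tuple[str, str]  # (clause type, clause text)


def clauses_in_scope(clauses: Sequence[Clause], *clause_types: str) -> List[Clause]:
    """Model which clauses a SCOPE CLAUSE(...) constraint selects."""
    if not clause_types:
        # Assumption: with no constraint, every recognized span is kept.
        return list(clauses)
    # GENERIC used as a constraint never triggers a rule.
    wanted = set(clause_types) - {"GENERIC"}
    return [clause for clause in clauses if clause[0] in wanted]


# Hand-labelled example, not real disambiguator output.
clauses = [
    ("INDEPENDENT", "She stayed longer"),
    ("COMPARATIVE", "than she had planned"),
    ("TEMPORAL", "when the rain started"),
]

print(clauses_in_scope(clauses, "INDEPENDENT"))
# [('INDEPENDENT', 'She stayed longer')]
print(clauses_in_scope(clauses, "TEMPORAL", "COMPARATIVE"))
# [('COMPARATIVE', 'than she had planned'), ('TEMPORAL', 'when the rain started')]
print(clauses_in_scope(clauses, "GENERIC"))
# []
```

A list of types works as a logical OR: a clause is selected when it matches any of the listed types.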

PHRASE scope

PHRASE is the narrowest among the standard textual subdivisions. It consists of one or several words that function as a constituent of the sentence and act as single units in its syntax. Common examples of "phrases" are noun phrases, verb phrases, etc. By selecting this option, a rule will act upon every phrase that has been recognized in the whole input document.

The syntax for the scope option PHRASE is the following:

SCOPE PHRASE [ON ATOM]
{
    rule(s)
}

The disambiguator is able to recognize the phrases in a sentence and identify which types they are. The predefined names of such PHRASE types can be optionally used as constraints to define a PHRASE scope. The syntax is:

SCOPE PHRASE(phraseType) [ON ATOM]
{
    rule(s)
}

The phrase types are:

| Phrase Type | Phrase Type (Italian only) | Description |
|---|---|---|
| AP | GA | Adjective Phrase |
| CP | CN | Conjunction Phrase |
| DP | GV | Adverb Phrase |
| NA | NA | Not Applicable (usually indicates punctuation) |
| NP | GN | Noun Phrase |
| PN | PN | Nominal Predicate |
| PP | GP | Preposition Phrase |
| RP | GR | Relative Phrase |
| VP | PV | Verb Phrase |

It is possible to select a single PHRASE type or a list of them to be used as a constraint. For example:

SCOPE PHRASE(NP)
{
    //rule(s)//
}

This rule acts upon any text block recognized to be a noun phrase, whereas this one:

SCOPE PHRASE(PP, NP)
{
    //rule(s)//
}

will act upon any text block recognized to be either a prepositional phrase or a noun phrase.

Finally, PHRASE types can be used in sequences of two or more elements, each separated by a slash sign (/). For example:

SCOPE PHRASE(NP/VP)
{
    //rule(s)//
}

will act upon any text block recognized to be a noun phrase only if followed by a verb phrase.

Note

In this case the rule will only act upon the NP phrase, because the VP phrase is a further scope restriction on the rule.

More complex combinations such as:

SCOPE PHRASE(AP, PP, NP/VP)
{
    //rule(s)//
}

are also valid. Such a scope definition will act upon any text block recognized to be an adjective phrase or a prepositional phrase or a noun phrase followed by a verb phrase.
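
To make the reading of such a constraint concrete, the Python sketch below parses a constraint string and selects the phrases it would cover. It is a model only: `parse_constraint` and `phrases_in_scope` are hypothetical names, the sample phrases are labelled by hand, and two assumptions go beyond this manual: "followed by" is taken to mean immediately adjacent, and for sequences longer than two elements the rule is assumed to act on the first phrase only.

```python
from typing import List, Sequence, Tuple

Phrase = Tuple[str, str]  # (phrase type, phrase text)


def parse_constraint(spec: str) -> List[List[str]]:
    """Turn "AP, PP, NP/VP" into [["AP"], ["PP"], ["NP", "VP"]]."""
    return [[t.strip() for t in alt.split("/")] for alt in spec.split(",")]


def phrases_in_scope(phrases: Sequence[Phrase], spec: str) -> List[Phrase]:
    """Select the phrases a SCOPE PHRASE(spec) constraint would act upon."""
    alternatives = parse_constraint(spec)
    types = [ptype for ptype, _ in phrases]
    selected = []
    for i, phrase in enumerate(phrases):
        for seq in alternatives:
            # Assumption: the sequence must match adjacent phrases, and only
            # the first phrase of a matching sequence is acted upon.
            if types[i:i + len(seq)] == seq:
                selected.append(phrase)
                break
    return selected


# Hand-labelled example, not real disambiguator output.
phrases = [
    ("NP", "The old dog"),
    ("VP", "chased"),
    ("NP", "the cat"),
    ("PP", "across the yard"),
    ("NA", "."),
]

print(phrases_in_scope(phrases, "NP/VP"))
# [('NP', 'The old dog')]
print(phrases_in_scope(phrases, "AP, PP, NP/VP"))
# [('NP', 'The old dog'), ('PP', 'across the yard')]
```

In the second call the noun phrase "the cat" is left out because it is followed by a PP rather than a VP, while the comma-separated alternatives still admit the prepositional phrase.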

Combination of PHRASE and CLAUSE scope options

PHRASE and CLAUSE options can be combined when setting a rule's scope. In particular, it is possible to select a phrase included within a specific clause scope. In other words, with this option a rule acts only upon phrases that have been recognized within a clause of the given type.

The syntax for combining the PHRASE and CLAUSE scope options is the following:

SCOPE PHRASE IN CLAUSE(clauseType)
{
    rule(s)
}

where clauseType corresponds to one of the types available for the CLAUSE option.

It is also possible to select one or more of the available phrase types to further restrict the scope definition:

SCOPE PHRASE(phraseType) IN CLAUSE(clauseType)
{
    rule(s)
}

For example:

SCOPE PHRASE(NP) IN CLAUSE(INDEPENDENT)
{
    //rule(s)//
}

will act upon any text block recognized to be a noun phrase within an independent clause.

The use of such combinations aims to identify a very precise and limited area for the rule to act upon; in fact, the hits generated by the rules with this kind of scope are more likely to be characterized by high precision rather than high recall.
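
The narrowing effect of the combination can be modelled as a nested filter: first keep the clauses of the requested type, then keep the phrases of the requested type inside them. The Python sketch below illustrates this under stated assumptions. The function name `phrases_in_clause` is hypothetical, the clause and phrase labels are assigned by hand, and only a single clause type and a single optional phrase type are handled.

```python
from typing import List, Optional, Sequence, Tuple

Phrase = Tuple[str, str]            # (phrase type, phrase text)
Clause = Tuple[str, List[Phrase]]   # (clause type, phrases in the clause)


def phrases_in_clause(
    clauses: Sequence[Clause],
    clause_type: str,
    phrase_type: Optional[str] = None,
) -> List[Phrase]:
    """Model SCOPE PHRASE[(phrase_type)] IN CLAUSE(clause_type)."""
    selected = []
    for ctype, phrases in clauses:
        if ctype != clause_type:
            continue  # the whole clause is outside the scope
        for ptype, text in phrases:
            if phrase_type is None or ptype == phrase_type:
                selected.append((ptype, text))
    return selected


# Hand-labelled example, not real disambiguator output.
clauses = [
    ("INDEPENDENT", [("NP", "The dog"), ("VP", "barked")]),
    ("TEMPORAL", [("CP", "when"), ("NP", "the postman"), ("VP", "arrived")]),
]

print(phrases_in_clause(clauses, "INDEPENDENT", "NP"))
# [('NP', 'The dog')]
print(phrases_in_clause(clauses, "TEMPORAL"))
# [('CP', 'when'), ('NP', 'the postman'), ('VP', 'arrived')]
```

A plain noun phrase scope would cover both "The dog" and "the postman"; adding the clause constraint keeps only the first, which is the trade of recall for precision described above.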


  1. As GENERIC identifies a portion of text which is not properly a clause, the SCOPE CLAUSE syntax with clauseType GENERIC does not trigger any rule.