SENTENCE
SENTENCE is an extraction transformation option that completes the matched value rather than normalizing it: it adds the text surrounding the original matched data to the final extracted value.
Its action is based on the concept of "sentence", a grammatical unit consisting of one or more words linked to each other by a syntactic relation in order to convey meaning. Sentences are recognized during the disambiguation process.
The SENTENCE option returns the whole sentence containing the value matched by an attribute.
The syntax of the SENTENCE option is the following:
SCOPE scopeOption
{
    IDENTIFY(templateName)
    {
        @field[attribute]|[SENTENCE]
    }
}
This option is useful when the extraction output needs to be expanded to include the context around a matched element. Consider the following example:
SCOPE SENTENCE
{
    IDENTIFY(SUBJECT)
    {
        @Subject[TYPE(NPH) + ROLE(SUBJECT)]|[SENTENCE]
    }
}
The purpose of this rule is to extract people's names (TYPE(NPH)) only if the names identified are the subjects of a sentence or clause (+ ROLE(SUBJECT)). If this condition is met, the SENTENCE transformation option expands every extracted value to the sentence in which the person's name appears as a subject.
Consider the extraction output if the rule above is run against the following sample sentence:
Assistant Commissioner Simon Byrne described detectives as "constables in T-shirts and jeans" and said he wanted to end the division between uniformed officers and detectives.
The text contains one value matching the sample rule: Simon Byrne. This concept is recognized as a person's name and is the subject of the first clause of the sentence. The SENTENCE transformation extracts the whole sentence in which the person's name was found: Assistant Commissioner Simon Byrne described detectives as "constables in T-shirts and jeans" and said he wanted to end the division between uniformed officers and detectives.
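The expansion described above can be pictured with a small Python sketch. It is only an illustration of the idea, not the engine's implementation: the product recognizes sentences during disambiguation, whereas this sketch assumes a naive splitter that ends a sentence at ".", "!" or "?" followed by whitespace or the end of the text. The function names `split_sentences` and `expand_to_sentence` are hypothetical, and the two short sentences placed around the sample sentence are added purely for demonstration.

```python
import re

# Sample sentence taken from the example above.
SAMPLE = ('Assistant Commissioner Simon Byrne described detectives as '
          '"constables in T-shirts and jeans" and said he wanted to end '
          'the division between uniformed officers and detectives.')

# Hypothetical surrounding sentences, added only for this illustration.
TEXT = "An earlier sentence. " + SAMPLE + " A later sentence."


def split_sentences(text):
    """Return (start, end) character spans of sentences (naive assumption:
    a sentence ends at ., ! or ? followed by whitespace or end of text)."""
    spans = []
    start = 0
    for m in re.finditer(r"[.!?]+(?=\s|$)", text):
        spans.append((start, m.end()))
        # The next sentence starts after any whitespace that follows.
        start = m.end()
        while start < len(text) and text[start].isspace():
            start += 1
    if start < len(text):
        # Trailing text without closing punctuation counts as a sentence.
        spans.append((start, len(text)))
    return spans


def expand_to_sentence(text, match_start, match_end):
    """Return the sentence that fully contains the matched span."""
    for s, e in split_sentences(text):
        if s <= match_start and match_end <= e:
            return text[s:e]
    # Fall back to the bare match if no sentence span contains it.
    return text[match_start:match_end]


# Stand-in for the value matched by the attribute: the person's name.
name = "Simon Byrne"
pos = TEXT.index(name)
extracted = expand_to_sentence(TEXT, pos, pos + len(name))
print(extracted)
```

In this sketch the matched value "Simon Byrne" is replaced in the output by the full sentence that contains it, while the neighboring sentences are left out, which mirrors the behavior the SENTENCE option is described as having.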