NEXT operator
A combination of two expressions made with the NEXT operator is true if the expression before the operator is matched in a sentence and the other expression is matched in any subsequent sentence within the scope of the rule.
The NEXT operator therefore requires the scope of the rule to span at least two consecutive sentences, such as SENTENCE*# with # > 1, or PARAGRAPH [1]. The rule does not compile if a different scope is used.
NEXT can be used to extend the reach of positional sequences, which have a single-sentence scope.
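The semantics described above can be modelled outside the rules language. The following Python sketch is only an illustration, not how the engine is implemented: the function name next_holds and the two lambda matchers are hypothetical stand-ins for real rule conditions, and the scope is assumed to be already split into sentences.

```python
from typing import Callable, Sequence

Matcher = Callable[[str], bool]


def next_holds(scope: Sequence[str], first: Matcher, second: Matcher) -> bool:
    """Return True if `first` matches a sentence and `second` matches a later one."""
    for i, sentence in enumerate(scope):
        if first(sentence):
            # Only sentences after the one matched by `first` are considered.
            if any(second(later) for later in scope[i + 1:]):
                return True
    return False


# Hypothetical stand-ins for conditions such as TYPE(NPH) and TYPE(PHO).
has_name = lambda s: "Tom Smith" in s
has_phone = lambda s: "123 456 7890" in s

scope = [
    "Name: Tom Smith",
    "Phone number: 123 456 7890",
    "Date of birth: 10/21/1976",
]

print(next_holds(scope, has_name, has_phone))      # True
print(next_holds(scope, has_phone, has_name))      # False: the order of the operands matters
print(next_holds(scope[:1], has_name, has_phone))  # False: no subsequent sentence exists
```

The last call shows why a single-sentence scope can never satisfy the operator: there is no subsequent sentence for the second expression to match.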
As an example of the use of NEXT, consider this:
TEMPLATE(PERSONAL_DATA)
{
    @NAME,
    @PHONE_NUMBER
}
...
SCOPE SENTENCE*3
{
    IDENTIFY(PERSONAL_DATA)
    {
        @NAME[TYPE(NPH)]
        NEXT
        @PHONE_NUMBER[TYPE(PHO)]
    }
}
The rule's condition matches a person's name (TYPE(NPH)) in a sentence and a phone number (TYPE(PHO)) in any subsequent sentence within the rule's scope (three sentences).
If the rule is run against this text:
Personal Data
------------
Name: Tom Smith
Phone number: 123 456 7890
Date of birth: 10/21/1976
the condition is satisfied by Tom Smith and 123 456 7890, which occur in consecutive sentences, so the rule is triggered and this record is extracted:
Template: PERSONAL_DATA
Field | Value |
---|---|
@NAME | Tom Smith |
@PHONE_NUMBER | 1234567890 |
The syntax is:
expression
NEXT
expression
where NEXT is a language keyword and must be written in uppercase.
The NEXT operator cannot be combined with OPTIONAL.
Tip
To work around this limitation, you can use the POSITION attribute, as in the example below.
For example, suppose that you want to extract people's names and, optionally, their addresses, if they occur in consecutive sentences.
Consider this template:
TEMPLATE(PERSONAL_DATA)
{
    @NAME,
    @ADDRESS
}
The following rule will not compile:
SCOPE SENTENCE*2
{
    IDENTIFY(PERSONAL_DATA)
    {
        @NAME[TYPE(NPH)]
        NEXT
        OPTIONAL
        {
            @ADDRESS[TYPE(ADR)]
        }
    }
}
but if you change the rule like this:
IDENTIFY(PERSONAL_DATA)
{
    @NAME[TYPE(NPH)]
    NEXT
    (
        @ADDRESS[TYPE(ADR)]
        OR
        POSITION(BEGIN SENTENCE)
    )
}
and you apply it to this text:
John is 20. He lives in Naples.
it will extract this record:
Template: PERSONAL_DATA
Field | Value |
---|---|
@NAME | John |
while if applied to this text:
John is 20. He lives in Baltimora Street.
you will get this record:
Template: PERSONAL_DATA
Field | Value |
---|---|
@NAME | John |
@ADDRESS | Baltimora Street |
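One way to read the workaround: the beginning of any subsequent sentence satisfies POSITION(BEGIN SENTENCE), so the operand after NEXT matches whenever a later sentence exists, and the address field is filled only when an address is actually found. The Python sketch below models that reading under stated assumptions. The extract function and the two regular expressions are hypothetical stand-ins for TYPE(NPH) and TYPE(ADR), which in the real engine rely on linguistic analysis, and preferring the address branch when both alternatives could match is also an assumption.

```python
import re
from typing import Dict, List, Optional

# Hypothetical stand-ins for TYPE(NPH) and TYPE(ADR).
NAME = re.compile(r"\b(John)\b")
ADDRESS = re.compile(r"\b([A-Z][a-z]+ Street)\b")


def extract(sentences: List[str]) -> Optional[Dict[str, str]]:
    """Model of: @NAME NEXT (@ADDRESS OR POSITION(BEGIN SENTENCE))."""
    for i, sentence in enumerate(sentences):
        name = NAME.search(sentence)
        if not name:
            continue
        later = sentences[i + 1:]
        if not later:
            # Without a subsequent sentence neither OR branch can match.
            return None
        record = {"@NAME": name.group(1)}
        for candidate in later:
            address = ADDRESS.search(candidate)
            if address:
                # The optional field is filled only when an address is present.
                record["@ADDRESS"] = address.group(1)
                break
        return record
    return None


print(extract(["John is 20.", "He lives in Naples."]))
# {'@NAME': 'John'}
print(extract(["John is 20.", "He lives in Baltimora Street."]))
# {'@NAME': 'John', '@ADDRESS': 'Baltimora Street'}
```

The position branch acts as an always-true fallback, which is what makes the address effectively optional without using OPTIONAL.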
[1] If a paragraph consists of only one sentence, the combination will be false.