NEXT operator
NEXT is an operator that combines the Boolean operator AND with the flexible positional sequence operator (<>).
The syntax is:
operand1
NEXT
operand2
where NEXT is a language keyword and must be written in uppercase.
The operator requires two input text tokens to match the combined operands in the order in which they are declared in the rule. For the rule to trigger, the two tokens must not occur in the same sentence: they must occur in different sentences, and how many sentences they can span depends on the rule scope.
For example, consider this extraction rule:
SCOPE SENTENCE*2
{
IDENTIFY(TEST)
{
@NAME[TYPE(NPH)]
NEXT
@PHONE_NUMBER[TYPE(PHO)]
}
}
The rule's condition matches a person's name (TYPE(NPH)) followed by a phone number (TYPE(PHO)) in the scope of at least two consecutive sentences.
If the rule is run against this text:
Personal Data
------------
Name: Tom Smith
Phone number: +1 123 456 7890
Date of birth: 21/10/1967
the condition is met by Tom Smith and +1 123 456 7890, which occur in consecutive sentences, so the rule is triggered.
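The matching behavior described above can be approximated outside the rules language. The following Python sketch is a toy model only, not how the engine is implemented: it assumes the text has already been split into sentences and typed tokens, treats each line of the sample text as one sentence, and uses hypothetical names (`Token`, `Sentence`, `next_match`) and a placeholder `DAT` type for the date. It models NEXT as "first operand in one sentence, second operand in a later sentence inside the same scope window".

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (surface text, entity type), e.g. ("Tom Smith", "NPH").
Token = Tuple[str, str]
Sentence = List[Token]


def next_match(sentences: List[Sentence], first_type: str, second_type: str,
               window: int = 2) -> Optional[Tuple[str, str]]:
    """Return the first (first, second) pair where `first_type` occurs in one
    sentence and `second_type` occurs in a later sentence of the same window.
    `window` plays the role of the N in SCOPE SENTENCE*N (an assumption)."""
    for i, sentence in enumerate(sentences):
        for text, typ in sentence:
            if typ != first_type:
                continue
            # Only the following sentences that fall inside the window are checked,
            # so two tokens in the same sentence never satisfy the condition.
            for later in sentences[i + 1:i + window]:
                for later_text, later_typ in later:
                    if later_typ == second_type:
                        return text, later_text
    return None


sentences = [
    [("Tom Smith", "NPH")],
    [("+1 123 456 7890", "PHO")],
    [("21/10/1967", "DAT")],  # placeholder type for the date, not used by the rule
]
print(next_match(sentences, "NPH", "PHO"))  # ('Tom Smith', '+1 123 456 7890')
```

In this model, a name and a phone number in the same sentence, or a phone number that precedes the name, produce no match.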
Tip
To combine the functionality of the NEXT operator with that of the OPTIONAL operator, consider this workaround.
For example, suppose that you want to extract people's names and, if they occur in the text, their addresses.
Consider this template:
TEMPLATE(PERSONAL_DATA)
{
@Name,
@Address
}
The following rule will generate a syntax error:
IDENTIFY(PERSONAL_DATA)
{
@Name[TYPE(NPH)]
NEXT
OPTIONAL
{
@Address[TYPE(ADR)]
}
}
Suppose the input text is:
John is 20. He lives in Naples.
If you change the rule like this:
IDENTIFY(PERSONAL_DATA)
{
@Name[TYPE(NPH)]
NEXT
(
@Address[TYPE(ADR)]
OR
POSITION(BEGIN SENTENCE)
)
}
you will get this record:
Template: PERSONAL_DATA
| Field | Value |
|---|---|
| @Name | John |
If, on the other hand, the changed rule is applied to this input text:
John is 20. He lives in Baltimora Street.
you will get this record:
Template: PERSONAL_DATA
| Field | Value |
|---|---|
| @Name | John |
| @Address | Baltimora Street |
The changed rule extracts the address if it occurs in the text; otherwise it extracts only the Name field, thanks to the POSITION attribute workaround.
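The logic of the workaround can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not the engine's behavior: the function name `extract_personal_data` and the token representation are hypothetical, the `GEO` type for "Naples" is a placeholder, and the fallback operand POSITION(BEGIN SENTENCE) is modeled simply as "a following sentence exists".

```python
from typing import Dict, List, Optional, Tuple

Token = Tuple[str, str]  # hypothetical (surface text, entity type) pair
Sentence = List[Token]


def extract_personal_data(sentences: List[Sentence]) -> Optional[Dict[str, str]]:
    """Return a record with Name and, when available, Address.

    The second operand is satisfied either by an address in the next sentence
    or, as a fallback, by the mere existence of a next sentence."""
    # Stop before the last sentence: the fallback needs a following sentence.
    for i, sentence in enumerate(sentences[:-1]):
        for text, typ in sentence:
            if typ != "NPH":
                continue
            record = {"Name": text}
            for later_text, later_typ in sentences[i + 1]:
                if later_typ == "ADR":
                    record["Address"] = later_text
                    break
            return record
    return None


no_address = [[("John", "NPH")], [("Naples", "GEO")]]  # GEO is a placeholder type
with_address = [[("John", "NPH")], [("Baltimora Street", "ADR")]]

print(extract_personal_data(no_address))    # {'Name': 'John'}
print(extract_personal_data(with_address))  # {'Name': 'John', 'Address': 'Baltimora Street'}
```

The alternative inside the parentheses always has something to match as long as a next sentence exists, which is why the address behaves as if it were optional.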