Negations in sequences

Overview

The negation operator (exclamation mark, !) can be placed before an operand in a positional sequence to negate it, that is, to indicate that something must not be present in that position of the sequence.

All types of sequences can act at either the atom level or the token level of a sentence, depending on the attribute that follows them.

The syntax is:

!operand

It is possible to negate more than one operand in a positional sequence, but at least one operand must be "positive", i.e. not negated.

Consider the following rule:

SCOPE SENTENCE
{
    DOMAIN(dom1:NORMAL)
    {
        LEMMA("arrest")
        >>
        !LEMMA("record")
    }
}

The rule's condition is met by the lemma arrest not strictly followed by the lemma record.

If the rule is run against the following text:

An officer who resigned from Weslaco Police Department amid drunken-driving allegations earlier this month was arrested after refusing a Breathalyzer test, according to arrest records obtained by The Monitor.

the lemma arrest is found twice, but the second time it is followed by the lemma record, so the rule is triggered only the first time.

The previous example shows how to negate the last operand in a sequence. It is also possible to negate the first operand and to negate several operands, as described below.

Left reference principle

By default, positive operands "point to" all the negative operands between them and the next positive operand, if any, in the sequence.
At the same time, each positive operand "points to" the next positive operand.

In other words, a positive operand is the left reference of any negative operand following it in the sequence and also of the next positive operand, if any.
Negative operands, for their part, are the left reference of the first positive operand following them.

So, if there are negative operands between positive operands, it is as if they are transparent when it comes to the relationship of a positive operand with the next.

For example, in the following sequence:

Positive operand 1
Sequence operator 1
Negative operand
Sequence operator 2
Positive operand 2

operator 1 determines the combination between the first positive operand and the negative operand and operator 2 determines the combination between the first positive operand and the second positive operand.
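
As an illustrative sketch of this layout, the following rule places a negative operand between two positive operands. The lemmas are chosen only for the example and are not taken from a real project:

SCOPE SENTENCE
{
    DOMAIN(dom1:NORMAL)
    {
        LEMMA("confirm")
        <>
        !LEMMA("deny")
        >>
        LEMMA("arrest")
    }
}

Under the left reference principle, this condition should read as: the lemma confirm strictly followed by the lemma arrest and, at the same time, not flexibly followed by the lemma deny. The >> operator does not relate deny to arrest, because confirm is the left reference of both operands.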

In the absence of preceding positive operands, negative operands "point to" the next positive operand. For example, in the following sequence:

Negative operand 1
Sequence operator 1
Negative operand 2
Sequence operator 2
Positive operand

operator 1 determines the combination between the first negative operand and the positive operand and operator 2 determines the combination between the second negative operand and the positive operand.

Consider the following rule.

SCOPE SENTENCE
{
    DOMAIN(dom1:NORMAL)
    {
        !ANCESTOR(47277)//  47277: military personnel, military man, servicemember, serviceman, man, service person, service man, serviceperson, military-man
        <>
        !LEMMA("avoid", "try", "attempt")
        >
        LEMMA("arrest")
    }
}

The first negative operand is combined with the only positive operand by a flexible sequence operator; the second negative operand is combined with it by a loose sequence operator, as illustrated below.

                !ANCESTOR(47277) flexibly followed by:
                                                        ╲
                                                         LEMMA("arrest")
                                                        ╱
!LEMMA("avoid", "try", "attempt") loosely followed by:

A single matched negative operand is enough to prevent the rule from being satisfied.
If the rule above is run against the following text:

Last night more than forty Israeli soldiers invaded the city of Nablus and raided two homes looking for two young men. One of them was arrested during the raid and the other one avoided arrest as he was working at the time of the raid.
At 2.30 am, Israeli soldiers broke into Mead Nijad's interrupting the family's sleep. As the soldiers entered the house, they ordered everyone to have their hands up; they asked for Emad, blindfolded, handcuffed and arrested him.

the rule's condition is matched just once. The positive operand matches three tokens (arrested and arrest in the first paragraph, arrested in the second paragraph), but avoided (matched by !LEMMA("avoid", "try", "attempt")) immediately precedes arrest, and soldiers (matched by !ANCESTOR(47277)) precedes the last occurrence of arrested from a distance. Only the first occurrence of arrested satisfies the condition.

Right reference operators

Normal sequence operators point in the forward direction. Right reference operators (single less-than sign, <, and double less-than sign, <<) point in the opposite direction.

Right reference operators can be used interchangeably with normal operators. When all of the rule's operands are positive, however, there is no reason to do so: the resulting sequences are perfectly equivalent to those built with normal operators, but less clear.

For example, conditions:

LEMMA("confirm") >> LEMMA("arrest")

and:

LEMMA("confirm") << LEMMA("arrest")

are perfectly equivalent, but the second contrasts with the general left-to-right reading direction which is common in the Rules language.

Right reference operators, however, are useful in changing the normal relationship between negative operands and positive operands; in fact, they should be used only when negative operands are involved.

Consider the following categorization rule, meant to give points to the dom1 domain whenever the text concerns the arrest of people.

SCOPE SENTENCE
{
    DOMAIN(dom1:NORMAL)
    {
        TYPE(NPH)
        <>
        ANCESTOR(71230, 71231)//  71230: take in, seize, arrest, apprehend, cop, collar, nab, slough, nail, sneeze, pick up  71231: catch, capture, get, captive
    }
}

The rule's condition matches a person's name (TYPE (NPH)) followed, at any distance, but in the same sentence, by any concept that descends from syncon 71230 (to arrest) or syncon 71231 (to capture).

If the rule is run against the following text:

John McAfee, the multimillionaire software developer, has not been captured, despite a cryptic post on his own blog saying that he is in police custody.

it is triggered by the John McAfee and captured sequence, but this is not what the user expects, because the text states that the man has not been captured. A more precise rule would be triggered only if the verb is not negated.

The following change to the rule's condition:

SCOPE SENTENCE
{
    DOMAIN(dom1:NORMAL)
    {
        TYPE(NPH)
        <>
        !KEYWORD ("has not been", "have not been", "was not", "wasn't", "were not", "weren't")
        >>
        ANCESTOR(71230, 71231)//  71230: take in, seize, arrest, apprehend, cop, collar, nab, slough, nail, sneeze, pick up  71231: catch, capture, get, captive
    }
}

seems to achieve the desired effect. The second operand should match has not been before captured but, being a negated operand, it should prevent the whole condition from being satisfied.
Indeed, the rule is not triggered (apparently a good result), but it is not due to the user's logic. If the modified rule is run against this text:

McAfee was in the midst of recounting an incident earlier this month in which 42 armed Belizean officers allegedly stormed his compound, arrested him and detained him for 14 hours with no food or water, and then let him go without charges.

it isn't triggered either and, again, this is not what the user was going for. Now we have one negative operand and two positive operands: how are they combined? The rule above is interpreted by the text intelligence engine as illustrated below.

            strictly followed by ANCESTOR(71230, 71231)
          ╱
TYPE(NPH) 
          ╲
            NOT flexibly followed by KEYWORD ("has not been", "have not...

This can be read as follows: a person's name strictly followed (>>) by any concept descending from syncon 71230 (to arrest) or syncon 71231 (to capture) and, at the same time, not flexibly followed (<>) by one of the keywords in the negated operand.
This is because of the left reference principle: the first positive operand is the left reference of both the negative operand and the second positive operand, so the negative operand between the positive operands is transparent when it comes to the relationship between the two positive operands:

TYPE(NPH) >> ANCESTOR(71230, 71231)

This "positive" condition doesn't hold true because arrested is too far from McAfee, so the whole condition is not met and the rule is not triggered.
The negative operand would be evaluated like this:

TYPE(NPH) <> !KEYWORD ("has not been", "have not been", "was not", ...

and actually it holds true, but it's "too late".

Sequence operators with a right reference come to the rescue! If the rule above is changed this way:

SCOPE SENTENCE
{
    DOMAIN(dom1:NORMAL)
    {
        TYPE(NPH)
        <>
        !KEYWORD ("has not been", "have not been", "was not", "wasn't", "were not", "weren't")
        <<
        ANCESTOR(71230, 71231)//  71230: take in, seize, arrest, apprehend, cop, collar, nab, slough, nail, sneeze, pick up  71231: catch, capture, get, captive
    }
}

it is interpreted by the text intelligence engine as follows:


    TYPE(NPH)
        ╲
          flexibly followed by ANCESTOR(71230, 71231) 
                              ╱
    NOT strictly preceded by KEYWORD ("has not been", "have not...

The right reference operator takes the negative operand away from the first positive operand's "attraction" and ties it to the second positive operand, which becomes the negative operand's right reference.
The first operator now determines the combination between the first positive operand and the next positive operand, not the combination between the first positive operand and the negative operand. With this change, the rule gives the expected results for both sample texts.
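
The rules above use only the double less-than sign. As a hedged sketch, assuming that < is the right reference counterpart of > in the same way that << is the counterpart of >>, a loose variant could be written as follows. The operands are reused from the earlier examples purely for illustration:

SCOPE SENTENCE
{
    DOMAIN(dom1:NORMAL)
    {
        TYPE(NPH)
        <>
        !LEMMA("avoid", "try", "attempt")
        <
        LEMMA("arrest")
    }
}

If that assumption holds, the condition reads: a person's name flexibly followed by the lemma arrest, where arrest is not loosely preceded by the lemmas avoid, try or attempt.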