
RELEVANT attribute

Among its many tasks, the disambiguator identifies the main lemmas and the main concepts expressed in the text. The RELEVANT attribute matches any token whose lemma is a main lemma or whose syncon corresponds to a main concept.
For example:

RELEVANT(LEMMA)

is true for every token whose lemma is among the main lemmas of the text.

The basic syntax is:

RELEVANT(list1[, list2])

where list1 and list2 can be LEMMA or SYNCON. LEMMA is used to match tokens with a lemma that is a main lemma, SYNCON to match tokens expressing one of the main concepts.
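
The matching logic described above can be pictured with a small Python sketch. It is only an illustration, not the disambiguator's implementation: the Token class, the relevant function and the syncon IDs are hypothetical, and the sketch assumes that when both LEMMA and SYNCON are listed, a token matches if either condition holds.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Token:
    text: str
    lemma: str
    syncon: Optional[int] = None  # hypothetical concept ID, None if absent


def relevant(token: Token, kinds: Set[str],
             main_lemmas: Set[str], main_concepts: Set[int]) -> bool:
    """Mimic RELEVANT(kinds...), where kinds holds "LEMMA" and/or "SYNCON"."""
    if "LEMMA" in kinds and token.lemma in main_lemmas:
        return True
    if "SYNCON" in kinds and token.syncon in main_concepts:
        return True
    return False


# Made-up disambiguation results, for illustration only.
main_lemmas = {"Medicare", "hospital"}
main_concepts = {1001}

tokens = [
    Token("hospitals", "hospital", 2002),   # main lemma, not a main concept
    Token("physician", "physician", 1001),  # main concept, not a main lemma
    Token("year", "year"),                  # neither
]

lemma_only = [t.text for t in tokens
              if relevant(t, {"LEMMA"}, main_lemmas, main_concepts)]
both = [t.text for t in tokens
        if relevant(t, {"LEMMA", "SYNCON"}, main_lemmas, main_concepts)]
print(lemma_only)
print(both)
```

In this sketch the LEMMA-only call keeps just the first token, while listing both LEMMA and SYNCON also keeps the second one.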

Main lemmas and main concepts have a score. With this syntax:

LEMMA:threshold

or:

SYNCON:threshold

only tokens whose lemma or concept is in the list of main lemmas or main concepts and has a score greater than or equal to threshold are matched, where threshold is a percentage between 0 and 100.
For example, given this text:

Although Congress may leave the details of Medicare savings to be worked out next year, there is already discussion of cutting special payments to teaching hospitals and small rural hospitals. Lawmakers are also considering reducing payments to hospitals for certain outpatient services that can be performed at lower cost in doctors' offices. Medicare pays substantially higher rates for the same services when they are provided in a hospital outpatient department rather than a doctor's office. The differential added $1.5 billion to Medicare costs last year, and as hospitals buy physician practices around the country, the costs are likely to grow, the Medicare commission says.

the main lemmas identified by the disambiguator are:

Lemma               Score
Medicare            17.9%
hospital            9.7%
doctor              7.3%
outpatient          6.8%
teaching hospital   6.7%
payment             6.3%
discussion          6.0%

Then, this attribute:

RELEVANT(LEMMA:7%)

matches all the tokens with the lemma Medicare, hospital or doctor, because these are the only main lemmas with a score greater than or equal to 7%.
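
The threshold selection in this example can be reproduced with a short Python sketch. It only illustrates the "greater than or equal to" comparison using the scores listed above. The lemma_scores dictionary and the lemmas_at_or_above function are hypothetical names, not part of the rules language.

```python
# Main lemma scores from the example above, as percentages.
lemma_scores = {
    "Medicare": 17.9,
    "hospital": 9.7,
    "doctor": 7.3,
    "outpatient": 6.8,
    "teaching hospital": 6.7,
    "payment": 6.3,
    "discussion": 6.0,
}


def lemmas_at_or_above(scores, threshold):
    """Return the lemmas whose score is greater than or equal to threshold."""
    return [lemma for lemma, score in scores.items() if score >= threshold]


# Equivalent of the 7% threshold in RELEVANT(LEMMA:7%).
selected = lemmas_at_or_above(lemma_scores, 7.0)
print(selected)  # ['Medicare', 'hospital', 'doctor']
```

Lowering the threshold to 6.5 in this sketch would also select outpatient and teaching hospital.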

Warning

Using the RELEVANT attribute alone in the condition of a rule may cause text analysis to be relatively slow. If possible, use it in combination with other attributes to avoid this issue.