Matching strategy

Platform computes the metrics of an experiment—precision, recall, F-measure—by comparing the results of the experiment (categories or extractions) with the expected results (the annotations made on the analyzed documents) and classifying the outcome as true positives, false positives or false negatives¹.
The strategy used to match the results with the annotations, which in some cases can be chosen by the user, determines the outcome and, consequently, the metrics.
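
As a minimal sketch of how the three counts turn into metrics, the function below applies the standard definitions of precision and recall. It assumes that the F-measure is the balanced F1 score and that a metric with a zero denominator is reported as 0.0; this page states neither, and the name `metrics` is illustrative, not part of Platform.

```python
def metrics(tp: int, fp: int, fn: int) -> dict[str, float]:
    """Standard precision, recall and F1 from outcome counts.

    Assumptions (not stated in the documentation): F-measure means
    balanced F1, and a zero denominator yields 0.0.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 8 true positives, 2 false positives, 2 false negatives.
print(metrics(tp=8, fp=2, fn=2))
```

Because every matching strategy below changes only how results are sorted into the three counts, the same function applies regardless of the strategy chosen.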

Matching strategy for categorization

The matching strategy used to determine the outcome of categorization experiments is as follows:

A category with a corresponding annotation is counted as one true positive
A category without a corresponding annotation is counted as one false positive
An annotation without a corresponding category is counted as one false negative
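
The three rules above can be sketched with set operations. This is an illustration only: it assumes categories are identified by name and appear at most once per document, and the function name and category names are hypothetical.

```python
def categorization_outcome(
    categories: set[str], annotations: set[str]
) -> tuple[int, int, int]:
    """Return (TP, FP, FN) for one document."""
    tp = len(categories & annotations)  # category with a corresponding annotation
    fp = len(categories - annotations)  # category without a corresponding annotation
    fn = len(annotations - categories)  # annotation without a corresponding category
    return tp, fp, fn


predicted = {"sports", "politics"}   # hypothetical experiment results
annotated = {"politics", "economy"}  # hypothetical expected results
print(categorization_outcome(predicted, annotated))  # (1, 1, 1)
```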

Matching strategy for extraction

The matching strategy used to determine the outcome of extraction experiments applies to all classes except metadata classes. The user can choose between:

Strict
Ignore value (not for thesaurus projects)
Ignore position

The user can set the matching strategy at the project level and can then override that choice each time an experiment is launched.

The matching strategy is based on the properties of annotations and extractions, which are:

Class
Value
Position in the text

With the strict matching strategy there must be a full match of all three properties between extraction and annotation to have a true positive.

With the ignore value strategy, instead, a partial match limited to class and position is considered a true positive, so the value can differ between the extraction and the annotation, but the positions must coincide.
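
The two position-based strategies can be sketched as a comparison of keys. The sketch assumes that a position is a pair of character offsets and that positions "coincide" when the offsets are identical; the `Item` type, the function name and the sample data are hypothetical, not Platform's data model.

```python
from typing import NamedTuple


class Item(NamedTuple):
    """An annotation or an extraction (illustrative shape)."""
    cls: str
    value: str
    start: int
    end: int


def positional_outcome(extractions, annotations, ignore_value=False):
    """Return (TP, FP, FN) for the strict or ignore value strategy."""
    def key(item):
        if ignore_value:
            return (item.cls, item.start, item.end)          # class + position
        return (item.cls, item.value, item.start, item.end)  # all three properties

    ext = {key(e) for e in extractions}
    ann = {key(a) for a in annotations}
    return len(ext & ann), len(ext - ann), len(ann - ext)


# Same class and position, different value (e.g. a normalized value).
annotations = [Item("CITY", "NYC", 10, 13)]
extractions = [Item("CITY", "New York", 10, 13)]

print(positional_outcome(extractions, annotations))                     # (0, 1, 1)
print(positional_outcome(extractions, annotations, ignore_value=True))  # (1, 0, 0)
```

Under the strict strategy a value mismatch costs twice, one false positive for the extraction and one false negative for the annotation, while the ignore value strategy turns the same pair into a single true positive.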

Finally, with the ignore position strategy, the match is based on class and value only.
All annotations of the same value count as one and all extractions of the same value also count as one.
Multiple annotations of a value, considered as one, "absorb" all the extractions of the same value, no matter how many there are or where in the text they occur, and this many-to-many match is counted as one true positive for the class-value pair.
If there is one true positive for a class-value pair there can be no false positives for that same pair, since all the positives of the class-value pair were "absorbed" by the annotations considered as one.
Instead, you have a false positive for a class-value pair, and again only one, when you have extractions, no matter how many, but no annotations.
Lastly, you have one false negative for a class-value pair if you have annotations, no matter how many, but no corresponding extractions.
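
The ignore position rules amount to comparing sets of distinct class-value pairs, since repeated annotations or extractions of the same pair collapse into one. A minimal sketch, with a hypothetical function name and sample data:

```python
def ignore_position_outcome(extractions, annotations):
    """Return (TP, FP, FN), counting each distinct (class, value) pair once."""
    ext_pairs = set(extractions)  # duplicates of the same pair collapse into one
    ann_pairs = set(annotations)
    tp = len(ext_pairs & ann_pairs)  # pair both annotated and extracted
    fp = len(ext_pairs - ann_pairs)  # pair extracted but never annotated
    fn = len(ann_pairs - ext_pairs)  # pair annotated but never extracted
    return tp, fp, fn


annotations = [("PERSON", "Ann"), ("PERSON", "Ann"), ("PERSON", "Bob")]
extractions = [("PERSON", "Ann"), ("PERSON", "Ann"), ("PERSON", "Ann"),
               ("PERSON", "Carl")]
print(ignore_position_outcome(extractions, annotations))  # (1, 1, 1)
```

Here the three extractions of Ann are absorbed by the two annotations of Ann into one true positive, Carl yields one false positive and Bob one false negative.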

Note

While counting as one for metrics, all extractions are still present in the model output.

Metadata annotations don't have a position, so the matching strategy is always "ignore position" and cannot be changed.

Example for non-metadata classes

For example, consider a class named CARBS. In one document the class was annotated four times, on the first two occurrences of sugars and on both occurrences of sugar in the following text:

Are there good sugars and bad sugars? Could there be sugars that are good for health or that, at least, are not bad? Why does the sugar added to coffee has a bad name, while the sugar contained in an apple doesn't?

with these values:

Annotation	Value
1	sugars
2	sugars
3	sugar
4	sugar

For some reason, the third occurrence of sugars was not annotated.

In an experiment, five occurrences of the CARBS class are extracted from the document, one for each occurrence of sugars or sugar in the text:

Are there good sugars and bad sugars? Could there be sugars that are good for health or that, at least, are not bad? Why does the sugar added to coffee has a bad name, while the sugar contained in an apple doesn't?

and all with the same normalized value sugar.

With the strict matching strategy you'll get:

True positives (TP): 2, because two extractions exactly correspond to the last two annotations.
False positives (FP): 3, because none of the three extractions of sugar found at the occurrences of sugars matches an annotation exactly: two of them have the same position as annotations for class CARBS, but those annotations have value sugars, and the third corresponds to the occurrence that was not annotated.
False negatives (FN): 2, because the first two annotations do not have completely corresponding extractions.

With the ignore value matching strategy:

True positives (TP): 4, because the last two annotations correspond exactly to two extractions, while for the other two annotations with value sugars there are extractions with different values (sugar) but a coincident position.
False positives (FP): 1, because one extraction, corresponding to the third occurrence of sugars, does not have a corresponding annotation.
False negatives (FN): 0, because all annotations have a corresponding extraction.

With the ignore position matching strategy:

True positives (TP): 1, because two class-value pairs have been annotated, but only one of them (CARBS = sugar) has extractions.
False positives (FP): 0, because every extraction corresponds to one of the annotated class-value pairs.
False negatives (FN): 1, because the annotated class-value pair CARBS = sugars has no corresponding extraction.
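
The counts in this example can be reproduced with a short script. It is a sketch under stated assumptions: positions are modeled as character spans found with a regular expression, and the tuple layout and the `outcome` function are hypothetical, not Platform's implementation.

```python
import re

TEXT = (
    "Are there good sugars and bad sugars? Could there be sugars that are "
    "good for health or that, at least, are not bad? Why does the sugar "
    "added to coffee has a bad name, while the sugar contained in an apple "
    "doesn't?"
)

# Character spans of the five occurrences: sugars, sugars, sugars, sugar, sugar.
spans = [m.span() for m in re.finditer(r"sugars?", TEXT)]
words = [TEXT[start:end] for start, end in spans]

# Annotations: every occurrence except the third, valued as written in the text.
annotations = [("CARBS", words[i], spans[i]) for i in (0, 1, 3, 4)]
# Extractions: all five occurrences, all with the normalized value "sugar".
extractions = [("CARBS", "sugar", span) for span in spans]

# Each strategy compares a different subset of (class, value, position).
KEYS = {
    "strict": lambda item: item,
    "ignore value": lambda item: (item[0], item[2]),
    "ignore position": lambda item: (item[0], item[1]),
}


def outcome(strategy):
    """Return (TP, FP, FN) for the example under the given strategy."""
    key = KEYS[strategy]
    ext = {key(e) for e in extractions}
    ann = {key(a) for a in annotations}
    return len(ext & ann), len(ext - ann), len(ann - ext)


for strategy in KEYS:
    print(strategy, outcome(strategy))
# strict (2, 3, 2)
# ignore value (4, 1, 0)
# ignore position (1, 0, 1)
```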

Example for metadata classes

If you have a metadata class PRESIDENT and this annotation:

PRESIDENT = Kennedy

for a test document, then if the value Kennedy is extracted there will be one true positive, no matter how many other extractions of the same value there are. If the value Kennedy is not extracted, there will be one false negative.
Without an annotation, if you have one or more extractions of PRESIDENT = Kennedy you'll have one false positive for the PRESIDENT class.

Matching strategy for thesauri

Models generated and used during thesaurus projects' experiments are de facto extraction models.
For this reason, the matching strategy used to determine the outcome of the experiments is comparable to that of extraction of non-metadata classes. Thesaurus projects do not have explicit extraction classes and groups of classes, but the entire taxonomy of concepts is considered a "thesaurus" group with a "concept" class, so every occurrence of an expression of any concept in the text of a document counts as one extraction of the "concept" class.

Note

The names of the group and its single class correspond to the values of the Template name and Field name parameters that can be set in the Rules generation step of the experiment wizard.

The difference from extraction is that there is no Ignore value strategy.

¹ Platform does not allow the annotation of negative ("unwanted") results, so there is no computation of true negatives.