Manage annotations

Introduction

Unlike categorization projects, where you annotate entire documents with the expected categories, annotating documents in extraction projects means marking portions of text.

This activity is critical because, when the extraction model is generated, the algorithms "learn" from the annotations, which are examples of desired extractions, so that the model can predict similar extractions.
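To make the difference between document-level categories and text-portion annotations concrete, here is a minimal Python sketch. The `Annotation` class, its fields, the class names `COMPANY` and `PERSON`, and the sample sentence are all hypothetical and only illustrate the idea; they do not reflect how the product stores annotations internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation: an information class attached to a text span."""
    label: str  # information class, e.g. "COMPANY" (illustrative name)
    start: int  # offset of the first character of the span (inclusive)
    end: int    # offset one past the last character of the span (exclusive)

    def text(self, document: str) -> str:
        """Return the portion of the document covered by this annotation."""
        return document[self.start:self.end]


document = "Acme Corp. hired Jane Doe in 2021."

# Categorization: the expected categories apply to the entire document.
categories = ["HR"]

# Extraction: each annotation marks a portion of the text.
annotations = [
    Annotation("COMPANY", 0, 10),
    Annotation("PERSON", 17, 25),
]

for annotation in annotations:
    print(annotation.label, "->", annotation.text(document))
```

Each annotation pairs a class with the exact portion of text that should be extracted, which is the kind of example the model learns from.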

Annotate all the information classes before generating the extraction model, according to the principle: "no annotations, no extractions".

Annotations should also be created in the set of test documents, especially after experiments that return extracted classes, because they are needed to measure the extraction quality, that is, whether the model has actually learned to extract.
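As a rough illustration of why test annotations are needed for measurement, the sketch below compares a set of annotations with a set of model extractions using exact-match precision, recall, and F1. This is a common way to score extraction and is an assumption here; the metrics the product actually reports may be computed differently. The function name, the tuple layout `(class, start, end)`, and the sample data are hypothetical.

```python
def extraction_quality(annotated, extracted):
    """Exact-match precision, recall, and F1 between annotations and extractions.

    Both arguments are collections of (class, start, end) tuples.
    Assumption: an extraction counts as correct only if the class and
    both offsets match an annotation exactly.
    """
    annotated, extracted = set(annotated), set(extracted)
    true_positives = len(annotated & extracted)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(annotated) if annotated else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Annotations created by hand in a test document (illustrative data).
gold = {("COMPANY", 0, 10), ("PERSON", 17, 25)}
# Extractions returned by an experiment: the PERSON span is too short.
predicted = {("COMPANY", 0, 10), ("PERSON", 17, 21)}

precision, recall, f1 = extraction_quality(gold, predicted)
print(precision, recall, f1)
```

Without annotations in the test documents, `gold` would be empty and there would be nothing to compare the extractions against.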

Annotations are managed in the Documents tab.

You can manage annotations in the:

Info

If you annotate in the Pdf document view and then run experiments, the resulting models must be preceded by an Extract converter processor in order to be used correctly in NL Flow.