Categorization projects

Categorization projects are used to create text intelligence models that classify text documents.

Within each project, users have to:

  1. Determine which categories need to be recognized, that is, define the taxonomy.
  2. Collect a set of training documents and at least one set of test documents that are representative of all the possible categories.
  3. Annotate the sets of documents with the expected results.
  4. Experiment with the creation of the model based on the annotated training set and apply the model to the test sets.
  5. Evaluate the results on the test sets.
  6. If the results are not satisfactory, adjust the sets of documents and the annotations, then repeat from step 4.
  7. Release and publish the model for practical use. For example, you can use it in the Workflow part.
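
The steps above can be illustrated with a minimal sketch in plain Python. This is not Platform's API or its model-building technique: the taxonomy, the sample documents, the stopword list, and the functions `tokenize`, `train`, `classify`, and `evaluate` are all hypothetical and only show how a taxonomy, annotated training and test sets, a model, and an evaluation fit together.

```python
from collections import Counter, defaultdict

# Step 1: a hypothetical two-category taxonomy.
TAXONOMY = ["sports", "finance"]

# Steps 2-3: tiny annotated sets of (text, expected category); invented data.
TRAIN = [
    ("the team won the match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the bank raised interest rates", "finance"),
    ("shares fell on the stock market", "finance"),
]
TEST = [
    ("the team scored a goal", "sports"),
    ("the bank bought shares", "finance"),
]

# Very common words are ignored so they do not favor any category.
STOPWORDS = {"the", "a", "on"}


def tokenize(text):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def train(annotated):
    """Step 4: count how often each word occurs under each category."""
    counts = defaultdict(Counter)
    for text, category in annotated:
        counts[category].update(tokenize(text))
    return counts


def classify(model, text):
    """Pick the category whose training word counts best match the text."""
    words = tokenize(text)
    return max(TAXONOMY, key=lambda c: sum(model[c][w] for w in words))


def evaluate(model, annotated):
    """Step 5: fraction of documents assigned their expected category."""
    hits = sum(classify(model, text) == expected for text, expected in annotated)
    return hits / len(annotated)


model = train(TRAIN)
accuracy = evaluate(model, TEST)
print(f"accuracy on test set: {accuracy:.2f}")
```

In this toy setting, step 6 corresponds to editing `TRAIN`, `TEST`, or their annotations and running `train` and `evaluate` again until the score is acceptable.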

In this section of the manual you will find all the information you need to perform the above operations with Platform.