Make experiments in categorization projects

Overview

Once the resources have been set up and the documents have been annotated, you can start experiments that consist of creating the categorization ML model and applying the model to a test library.

An experiment is based on a training library and a test library.

The training library, or training set, is a set of annotated documents from which the model learns. The training library must meet the following requirements:

  • At least two annotated documents.
  • At least two annotated categories.
  • At least one category with ten annotations.
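The three requirements above can be checked before starting an experiment. The following is a minimal sketch, not the Platform's own validation: the `documents` mapping (document name to the list of categories annotated on it) and the function name are hypothetical, and "ten annotations" is assumed to mean "at least ten".

```python
from collections import Counter


def check_training_library(documents):
    """Return the list of unmet requirements (empty if the library qualifies).

    `documents` is a hypothetical mapping: document name -> list of category
    labels annotated on that document. It is not the Platform's data model.
    """
    # Only documents with at least one annotation count as annotated.
    annotated = {name: cats for name, cats in documents.items() if cats}
    # Number of annotations per category across all annotated documents.
    per_category = Counter(cat for cats in annotated.values() for cat in cats)

    problems = []
    if len(annotated) < 2:
        problems.append("fewer than two annotated documents")
    if len(per_category) < 2:
        problems.append("fewer than two annotated categories")
    # Assumption: "ten annotations" is read as "at least ten".
    if not any(count >= 10 for count in per_category.values()):
        problems.append("no category with at least ten annotations")
    return problems


# Usage: ten documents annotated "sports" plus one annotated "politics".
library = {f"doc{i}": ["sports"] for i in range(10)}
library["doc10"] = ["politics"]
print(check_training_library(library))
```

An empty list means that all three requirements are met; otherwise each entry names one requirement that the library does not satisfy.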

The test library, or test set, is a set of annotated documents that the model parses to produce the analysis results, which are then used to check the model.
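Checking a model against an annotated test library usually means comparing the categories the model assigns with the categories in the annotations. The sketch below computes micro-averaged precision and recall for that comparison. It is an illustration only: this page does not state which formulas the Platform uses for its statistics, and the `expected` and `predicted` mappings are hypothetical.

```python
def precision_recall(expected, predicted):
    """Micro-averaged precision and recall over a test set.

    `expected` and `predicted` are hypothetical mappings from document name
    to a set of category labels (annotations vs. model output).
    """
    true_pos = false_pos = false_neg = 0
    for name, gold in expected.items():
        guess = predicted.get(name, set())
        true_pos += len(gold & guess)    # categories found and annotated
        false_pos += len(guess - gold)   # categories found but not annotated
        false_neg += len(gold - guess)   # categories annotated but missed
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Usage with two test documents.
expected = {"d1": {"a", "b"}, "d2": {"c"}}
predicted = {"d1": {"a"}, "d2": {"c", "d"}}
print(precision_recall(expected, predicted))
```

Precision drops when the model assigns categories that are not annotated, and recall drops when it misses annotated ones, which is the trade-off behind the "Precision and recall balance" parameters in the wizard.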

Platform provides the following model types for categorization projects:

  • Auto-ML Categorization.
  • Explainable Categorization.
  • Bootstrapped Studio Project, which is a simplified version of the Explainable Categorization engine.
  • Studio, which is an imported CPK.

The Explainable Categorization engine creates a set of symbolic rules that can be exported as a CPK.

The Auto-ML Categorization engine creates a Machine Learning model.

The Studio engine creates a categorization model based on an imported expert.ai Studio CPK project.

To start an experiment:

  1. In the upper bar, select Start an experiment.
  2. In the Start an experiment dialog:

    2.1. Enter the experiment name in Name, or leave it empty for automatic assignment.

    2.2. Select the test library in the Test library drop-down menu.

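    2.3. Select one of the available engine types (Auto-ML Categorization, Explainable Categorization, Bootstrapped Studio Project or Studio), and then follow the wizard for that engine, as described in the engine procedures below. Select Back to go back to the previous stage or Cancel to quit.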

    Note

    Select Hide advanced parameters to hide the advanced parameters that are marked with a blue italic caption.

  3. Check the summary, then select Start to start the experiment.

The experiment progress window is displayed during the engine process.

To terminate the process before its end, select Delete experiment.

The process consists of six sequential stages:

  1. Initialization
  2. Model generation preparation
  3. Model generation
  4. Document analysis preparation
  5. Document analysis
  6. Experiment wrap-up
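When tracking an experiment from a script or a log, the six stages above can be modeled as an ordered enumeration. This is a minimal sketch: the `Stage` class and the `next_stage` helper are hypothetical names for illustration and are not part of any Platform API.

```python
from enum import IntEnum


class Stage(IntEnum):
    """The six sequential stages, numbered as in the list above."""
    INITIALIZATION = 1
    MODEL_GENERATION_PREPARATION = 2
    MODEL_GENERATION = 3
    DOCUMENT_ANALYSIS_PREPARATION = 4
    DOCUMENT_ANALYSIS = 5
    EXPERIMENT_WRAP_UP = 6


def next_stage(stage):
    """Return the stage that follows `stage`, or None after the last one."""
    return Stage(stage + 1) if stage < Stage.EXPERIMENT_WRAP_UP else None


# Usage: walk through the stages in order.
stage = Stage.INITIALIZATION
while stage is not None:
    print(stage.value, stage.name)
    stage = next_stage(stage)
```

Using an integer-backed enumeration keeps the stages comparable, so a check such as `stage >= Stage.DOCUMENT_ANALYSIS_PREPARATION` tells whether model generation has already finished.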

Note

If the experiment fails, the Info tab appears, showing information about the failure and the type of errors. You can also check the Activity log tab for further information.

Once the process is completed, the analytics are displayed in the Experiments tab, Statistics sub-tab, where you can analyze and interpret the results.

Note

Experiment results are associated with the test library you chose in the experiment wizard, so the Experiments tab is often disabled for other libraries.

Auto-ML Categorization engine procedure

  1. Select the training library in the Training library drop-down menu and the selection policy in Training documents selection policy, then select Next to go on.
  2. Select the Machine learning model type, then select Next to go on.
  3. Select the Machine Learning automatic self-tuning process parameters, then select Next to go on.
  4. Select the Feature space (advanced): which data elements to use in feature vector creation parameters, then select Next to go on.
  5. Select the Model-specific hyperparameters, then select Next to go on.
  6. Select the Precision and recall balance parameters, then select Next to go on.

Explainable Categorization engine procedure

  1. Select the training library in the Training library drop-down menu and the selection policy in Training documents selection policy, then select Next to go on.
  2. Select the Generic parameters, then select Next to go on.
  3. Select the Categorization rules generation hyperparameters, then select Next to go on.
  4. Select the Categorization "onCategorizer" optimization hyperparameters, then select Next to go on.
  5. Select the Precision and recall balance parameters, then select Next to go on.

Bootstrapped Studio Project engine procedure

  1. Select the training library in the Training library drop-down menu and the selection policy in Training documents selection policy, then select Next to go on.
  2. Select the Categorization rules generation hyperparameters, then select Next to go on.
  3. Select the Categorization "onCategorizer" optimization hyperparameters, then select Next to go on.

Studio engine procedure

  1. Select the model in Model selection, then select Next to go on.
  2. Check the remapping in Remapper, then select Next to go on.