Make experiments in categorization projects

Overview

Once the resources have been set up and the documents have been annotated, you can start experiments. An experiment creates a categorization model and applies it to a test library.

An experiment is based on a training library, a test library and a model:

The training library, or training set, consists of an annotated document set that helps the model to learn. The training library must meet the following requirements:

At least two annotated documents.
At least two annotated categories.
At least one category with ten annotations.
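The three requirements above can be sketched as a simple pre-check. This is only an illustration, not part of the Platform: the function name and the representation of annotations as (document ID, category) pairs are assumptions, and "ten annotations" is read here as "at least ten".

```python
from collections import Counter


def check_training_library(annotations):
    """Return the list of unmet requirements (empty if the library qualifies).

    `annotations` is a hypothetical list of (document_id, category) pairs,
    one pair per annotation; this is not a Platform data format.
    """
    documents = {doc for doc, _ in annotations}
    per_category = Counter(category for _, category in annotations)

    problems = []
    if len(documents) < 2:
        problems.append("fewer than two annotated documents")
    if len(per_category) < 2:
        problems.append("fewer than two annotated categories")
    # Assumption: "ten annotations" means ten or more.
    if not any(count >= 10 for count in per_category.values()):
        problems.append("no category with ten annotations")
    return problems
```

An empty result means that all three requirements are met; otherwise each string names one requirement that the library fails.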

The test library, or test set, consists of an annotated document set that the model parses so that the model can be checked.

The model parses the test library and produces the analysis results.

Platform provides the following model types for categorization projects:

Auto-ML Categorization.
Explainable Categorization.
Bootstrapped Studio Project, which is a simplified version of the Explainable Categorization engine.
Studio, which is an imported CPK.

The Explainable Categorization model type creates a symbolic rule set which can be exported as CPK.

The Auto-ML Categorization model type creates a Machine Learning model.

The Studio model type creates a categorization model based on an imported expert.ai Studio CPK project.

To start an experiment:

In the upper bar, select Start an experiment.
In the Start an experiment dialog:
- Enter the experiment name in Name, or leave it empty to have a name assigned automatically.
- Select the test library in the Test library drop-down menu.
- Select the available engine type:
  - Auto-ML Categorization
  Or:
  - Explainable Categorization
  Or:
  - Bootstrapped Studio Project
  Or:
  - Studio
- Select Next and follow the wizard.
Check the summary, then select Start to start the experiment.

The experiment progress window is displayed while the engine is running.

To stop the process before it completes, select Delete experiment.

The process consists of six sequential stages:

Initialization
Model generation preparation
Model generation
Document analysis preparation
Document analysis
Experiment wrap-up
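Because the stages always run in the same order, they can be modeled as an ordered enumeration. The sketch below is illustrative only: the enum and the helper are hypothetical names, not Platform identifiers, and only the stage names and their order come from the list above.

```python
from enum import IntEnum


class ExperimentStage(IntEnum):
    """The six stages, numbered in the order in which they run."""
    INITIALIZATION = 1
    MODEL_GENERATION_PREPARATION = 2
    MODEL_GENERATION = 3
    DOCUMENT_ANALYSIS_PREPARATION = 4
    DOCUMENT_ANALYSIS = 5
    EXPERIMENT_WRAP_UP = 6


def next_stage(stage):
    """Return the stage that follows `stage`, or None after the last one."""
    if stage is ExperimentStage.EXPERIMENT_WRAP_UP:
        return None
    return ExperimentStage(stage + 1)
```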

Note

If the experiment fails, the Info tab appears and shows the error type and related information. You can also check the Activity log tab for further information.

Once the process is completed, the analytics are displayed in the Experiments tab, Statistics sub-tab, where you can analyze and interpret the results.

Note

Experiment results are associated with the test library you chose in the experiment wizard, so the Experiments tab is often disabled for other libraries.

Auto-ML Categorization engine procedure

Note

Select Hide advanced parameters to hide the advanced parameters that are marked with a blue italic caption.

Select Back to go back to the previous stage or Cancel to quit.

Select the training library in the Training library drop-down menu and the selection policy in Training documents selection policy, then select Next to go on.
Select the Machine learning model type, then select Next to go on. Multiple selection is allowed: select one type to run a single experiment, or two or at most three types to run experiments in parallel.
Select the Problem definition parameters, then select Next to go on.
Select the Feature space (advanced): which data elements to use in feature vector creation parameters, then select Next to go on.
Select the Model-specific hyperparameters, then select Next to go on.
Select the Precision and recall balance parameters, then select Next to go on.
Select the Machine Learning automatic self-tuning process parameters, then select Next to go on.

Explainable Categorization engine procedure

Select the training library in the Training library drop-down menu and the selection policy in Training documents selection policy, then select Next to go on.
Select the Generic parameters, then select Next to go on.
Select the Categorization rules generation hyperparameters, then select Next to go on.
Select the Categorization "onCategorizer" optimization hyperparameters, then select Next to go on.
Select the Precision and recall balance parameters, then select Next to go on.

Bootstrapped Studio Project engine procedure

Select the training library in the Training library drop-down menu and the selection policy in Training documents selection policy, then select Next to go on.
Select the Categorization rules generation hyperparameters, then select Next to go on.
Select the Categorization "onCategorizer" optimization hyperparameters, then select Next to go on.

Studio engine procedure

Select the model in Model selection, then select Next to go on.
Check the remap in Remapper, then select Next to go on.
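The four procedures above differ mainly in which wizard pages they show. As a compact summary, they can be written as a lookup table; this is a sketch for reference, not a Platform API. The dictionary, the function name and the shortened page labels are assumptions; only the engine names and the page order come from the procedures above.

```python
# Hypothetical summary of the wizard pages per engine type.
# The names mirror the procedures in the text; nothing here calls the Platform.
WIZARD_PAGES = {
    "Auto-ML Categorization": [
        "Training library",
        "Machine learning model type",
        "Problem definition",
        "Feature space (advanced)",
        "Model-specific hyperparameters",
        "Precision and recall balance",
        "Machine Learning automatic self-tuning process",
    ],
    "Explainable Categorization": [
        "Training library",
        "Generic parameters",
        "Categorization rules generation hyperparameters",
        'Categorization "onCategorizer" optimization hyperparameters',
        "Precision and recall balance",
    ],
    "Bootstrapped Studio Project": [
        "Training library",
        "Categorization rules generation hyperparameters",
        'Categorization "onCategorizer" optimization hyperparameters',
    ],
    "Studio": [
        "Model selection",
        "Remapper",
    ],
}


def pages_for(engine):
    """Return a copy of the ordered wizard pages for the given engine type."""
    try:
        return list(WIZARD_PAGES[engine])
    except KeyError:
        raise ValueError(f"unknown engine type: {engine!r}") from None
```

The table makes it easy to see that Bootstrapped Studio Project shows a subset of the Explainable Categorization pages, which matches its description as a simplified version of that engine.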