Create a categorization project

Possibilities

You can create a new categorization project using a wizard that you launch on the main dashboard.

There are two possibilities:

  • Create the project from scratch.
  • Bootstrap the project by importing a CPK. This already provides the project with a model to experiment with. In such a project, the category definitions are taken from the CPK and become immutable: users cannot add, remove, rename or reorganize them.

Creation from scratch: first steps

Launch the wizard

  • Select the plus button, then Categorization project.

Or:

  • If there are no projects or corpora:

    • Select the Categorization project card.

Or:

  • If there are already other projects or corpora, but no categorization project:

    • Select Categorization in the left menu, then Create your first project.

Or:

  • If there are already other categorization projects:

    • Select Categorization in the left menu, then select New Categorization project in the Create project area.

    Or:

    • Select New Categorization project in the Create item area.

First step: project info and tech version

Enter the project name, select the tech version [1] and optionally enter a description, then select Create.

Second step: project language

Select the project language, then select Next.

Third step: project resources

On the Project resources page you must define, or just draft, the project resources. You can always edit them once the project has been created.

Use one of the following procedures:

  1. To define categories from scratch, select Create project Taxonomy, then select Next.
  2. In the Create Taxonomy dialog enter the label of a category and then press Enter to confirm and possibly add another. When done, select Next.
  3. Categories are displayed in the Resources tab. It is possible to edit them and define text sections. Select Next to go on.

Or:

  1. To import category definitions from an XML file, select Import Taxonomy.
  2. Locate the XML file in the file system.
  3. Imported categories are displayed in the Resources tab. It is possible to edit them and define text sections. Select Next to go on.

Or:

  1. Platform can automatically suggest categories based on the text of documents. It finds similarities between documents and derives a category for each similarity. To have Platform automatically suggest categories, select Build Magic Taxonomy, then select Next.
  2. Define the library to use to generate the category tree.

    Tip

    A minimum of 20 documents is required to obtain meaningful suggestions.

    At the end of the process, library documents will be automatically annotated with the categories they contributed to derive.

  3. In the Magic Taxonomy dialog:

    • Select Clustering based, which is suggested for medium-large documents, then:

      • Select One level taxonomy or Two level taxonomy to define the depth of the category tree.
      • Switch on Manual configuration if you want to set the number of categories and the mode, such as Strict mode or Soft mode.

    Or:

    • Select Sequence based, which is suggested for small documents, if you want categories to derive from word sequences, then enter the number of desired categories in Number of nodes.
  4. Select Next. The category tree is displayed in the Resources tab. It is possible to edit it and create sections.

  5. Select Next to go to the final step of the wizard.
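
To give an intuition of what deriving categories from word sequences can look like, here is a minimal sketch. It is not Platform's algorithm, which is not described here: the function name `suggest_categories`, the use of two-word sequences and the ranking by document frequency are all assumptions made for illustration. Only the `number_of_nodes` parameter mirrors the Number of nodes setting, and the returned annotations mirror the idea that library documents are annotated with the categories they contributed to derive. The sample uses four documents for brevity, far fewer than the suggested minimum of 20.

```python
import re
from collections import Counter


def suggest_categories(documents, number_of_nodes):
    """Illustrative only: use the most widespread two-word sequences as labels.

    Returns (labels, annotations), where annotations[i] lists the labels
    that document i contributed to.
    """
    counts = Counter()
    doc_bigrams = []
    for text in documents:
        words = re.findall(r"[a-z]+", text.lower())
        # Set of distinct two-word sequences found in this document
        bigrams = {" ".join(pair) for pair in zip(words, words[1:])}
        doc_bigrams.append(bigrams)
        counts.update(bigrams)  # counts documents, not occurrences

    # Most widespread sequences first; ties broken alphabetically
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    labels = [label for label, _ in ranked[:number_of_nodes]]

    label_set = set(labels)
    annotations = [sorted(bigrams & label_set) for bigrams in doc_bigrams]
    return labels, annotations


documents = [
    "Interest rates rise again",
    "Central bank lifts interest rates",
    "Football match ends in draw",
    "Late goal decides football match",
]
labels, annotations = suggest_categories(documents, number_of_nodes=2)
print(labels)
print(annotations)
```

With this toy input, the two suggested labels are the only sequences shared by more than one document, and each document is annotated with the label it helped produce.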

Creation with a CPK: first steps

Launch the wizard

  • Select the plus button then Upload CPK.

Or:

  • Select Upload CPK in the Create item area.

Or:

  • Select Categorization in the left column then Upload CPK in the Create project area for a categorization project.

First step: project type, CPK, project info and tech version

  1. In the Create a new project from a CPK dialog, select Browse files to choose the CPK file, or Replace files if you want to replace it after choosing one.
  2. If you started the wizard in one of the first two ways described above, select the type of project, that is Categorization.
  3. Enter the project name, select the tech version [1] and optionally enter a description, then select Create project.

Second step: check project resources

The resources of the project, that is the categories and any definition of text sections, are automatically taken from the CPK and become immutable: a padlock icon appears next to the project name to indicate this.
Select Check resources to examine those resources, then Next when you are ready to go to the next step. If you are not interested in checking the imported resources, select Skip this step to continue the wizard.

Common step: library

The wizard requires the creation of an initial library to train or test models. The library can be modified and other libraries can be added once the project has been created.
If you create the initial category tree with the magic taxonomy tool (see above), you are required to define the library during that step of the wizard. Otherwise, the wizard asks you to define the library after the initial resources of the project have been defined.

When the wizard asks you to define the library:

  1. Optionally enter the library name. The name will be chosen automatically if you don't specify one.
  2. Select the type of library and select Next.
  3. In Corpora and folders, select the source for the library. You can select an existing corpus or upload documents from the file system.

    • If you choose an existing corpus, select the corpus and then select Next.

      Warning

      Only corpora that you own or that are shared with you are listed and, among them, only those associated with the tech version you previously selected for the new project.

    • If you choose to upload documents, learn how to do it by reading the article dedicated to the topic.
      It is possible to upload an annotated library.
      The upload process creates an ephemeral corpus that is listed and can be chosen as a source of documents for the library.

    If there are many corpora, you can use these navigation tools:

    • Use the search bar to look for a corpus. Type at least three characters of the corpus name.
    • Select Show table view to list corpora in a table format.
    • Select Show card view to display corpora as cards.
    • When in table view, you can sort items by selecting the desired column header.
    • When in card view, select Share to share a corpus with other users.

    The information shown for corpora is the same as that displayed in the Corpus info sub-panel of the main dashboard.

  4. When done, select Next.
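
The rules that determine which corpora are listed can be summarized in a short sketch. Everything here is hypothetical and only models the behavior described above: the `Corpus` class, its fields and the `selectable_corpora` function are not part of any Platform API. Two further assumptions are made: a search query shorter than three characters is ignored, and matching is a case-insensitive substring match on the corpus name.

```python
from dataclasses import dataclass


@dataclass
class Corpus:
    """Hypothetical record describing a corpus."""
    name: str
    owner: str
    shared_with: tuple
    tech_version: str


def selectable_corpora(corpora, user, tech_version, query=""):
    """Return the corpora a user could pick as the library source."""
    # Assumption: the search is applied only from three characters onward
    needle = query.lower() if len(query) >= 3 else ""
    result = []
    for corpus in corpora:
        accessible = corpus.owner == user or user in corpus.shared_with
        same_version = corpus.tech_version == tech_version
        matches = needle in corpus.name.lower()  # "" matches every name
        if accessible and same_version and matches:
            result.append(corpus)
    return result


corpora = [
    Corpus("News 2023", "anna", (), "v1"),
    Corpus("Contracts", "bob", ("anna",), "v1"),
    Corpus("Emails", "bob", (), "v1"),
    Corpus("News archive", "anna", (), "v2"),
]
print([c.name for c in selectable_corpora(corpora, "anna", "v1")])
print([c.name for c in selectable_corpora(corpora, "anna", "v1", "new")])
```

The version strings and user names are placeholders. The point is that ownership or sharing, tech version and the optional search all have to be satisfied for a corpus to appear.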

Creation from scratch: final step

The final step is a summary of the wizard.
Select Open project to end the wizard and start working on the project.

Creation from a CPK: final step

The final step is a summary of the wizard.
Turn on Start an experiment if you want to immediately start an experiment, using the CPK model against the document library, when you open the project. Select Open project to end the wizard and start working on the project.

Wizard control

  • To quit the wizard at any step, select Exit wizard in the upper right corner or select the expert.ai icon in the upper left corner of the page.
  • When asked to save changes, you can select Delete project to delete the project. If your changes are saved, you can interrupt the wizard and resume it at a later time by selecting the project on the main dashboard.

[1] The project's tech version is immutable: it cannot be changed. If you need a different tech version, you must create a new project.