Create an extraction project

Platform provides two ways to create a new extraction project: with the Extraction wizard or by importing a CPK file.

Create an extraction project with the Extraction wizard

How you start the wizard depends on the items already present in the dashboard.

  1. Go to the main dashboard.

If there are no projects or corpora created

  1. Select the Extraction project card.

Or:

  1. Select the plus button and then Extraction project. This creation procedure is always available.

If there are other existing projects or corpora but not extraction projects

  1. Select Extraction in the left menu, then Create your first project.

If there are other existing extraction projects

  1. Select Extraction in the left menu, then select New Extraction project in the Create project area.

Or:

  1. Select New Extraction project in the Create item area.
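
The three dashboard cases above can be summarized as a small decision function. This is only an illustrative sketch: the function name, its parameters and the path strings are hypothetical and are not part of Platform or of any Platform API.

```python
from typing import List


def wizard_entry_points(total_items: int, extraction_projects: int) -> List[str]:
    """Return the UI paths that open the Extraction wizard.

    total_items: projects plus corpora on the dashboard (hypothetical input).
    extraction_projects: how many of those items are extraction projects.
    """
    if min(total_items, extraction_projects) < 0 or extraction_projects > total_items:
        raise ValueError("inconsistent counts")

    # The plus button is described as always available.
    paths = ["plus button > Extraction project"]
    if total_items == 0:
        # No projects or corpora created yet.
        paths.append("Extraction project card")
    elif extraction_projects == 0:
        # Other projects or corpora exist, but no extraction projects.
        paths.append("Extraction (left menu) > Create your first project")
    else:
        # At least one extraction project already exists.
        paths.append("Extraction (left menu) > New Extraction project (Create project area)")
        paths.append("New Extraction project (Create item area)")
    return paths
```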

Common preliminary steps

  1. In the New Extraction project dialog, enter the mandatory project name in Project name, select the tech version from the Tech version dropdown menu and, optionally, enter a description in Description.
  2. Select Create.

Warning

Once the tech version has been selected, it cannot be changed. To use a different tech version, you must re-create the project.

First step: Project language

In the Project language page, select the project language for your analysis model, then select Next.

Second step: Project resources

In this step you define all or some of the resources of the project. You can always edit the resources at a later time once the project has been created.

In the Project resources page:

  • To define positional classes from scratch, select Create project classes then Next.

    1. In the Create classes dialog, enter the name of a class and press Enter.
    2. Repeat step 1 to define other classes.
    3. Select Next.
  • To import the definition of classes from a JSON file:

    1. Select Import classes.
    2. Open a file containing the JSON.

Defined classes are displayed in the Resources panel. You can edit them and also define sections and metadata classes.

Select Next to go on.
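
Before importing classes from a JSON file, it can help to confirm that the file is well-formed JSON. The sketch below only checks syntax. It makes no assumption about the structure Platform expects for class definitions, which this article does not describe, and the function name is hypothetical.

```python
import json
from pathlib import Path


def is_well_formed_json(path: str) -> bool:
    """Return True if the file at `path` parses as JSON, False otherwise."""
    try:
        json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, UnicodeDecodeError, json.JSONDecodeError) as err:
        # Missing file, wrong encoding, or a syntax error in the JSON.
        print(f"{path}: not usable as a JSON import file ({err})")
        return False
    return True
```

A file that passes this check can still be rejected by the import if its content does not match the expected class definition layout.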

Third step: Project library

During this step you provide one of the libraries needed to train and test the ML model. More libraries can be added later.

In the Project library dialog:

  1. Enter the library name in Library name (optional step).
  2. Select the type of library then select Next.
  3. In Corpora and folders, select the source for the library. You can select an existing corpus or upload documents from the file system.

    If you choose an existing corpus:

    • Select the corpus.

      Info

      If you want to use a corpus, you can use these tools to find it:

      • Use the search bar to look for a corpus. Your search must contain at least three characters.
      • Select Show table view to view your corpora in a table format.
      • Select Show card view to view your corpora in a card format.
      • When in table view, you can sort items by selecting the desired column header.
      • When in card view, select Share to share the corpora with other users.

      The information displayed for existing corpora is the same as that displayed in the Corpus info sub-panel of the main dashboard.

      Warning

      The corpora displayed are those related to the tech version previously selected in the New Extraction project dialog.

    If you choose to upload documents, see the dedicated article. When the upload is complete, a temporary uploaded corpus is created and made available in the window.

  4. When done, select Next.

Info

It is also possible to upload an annotated library.

Fourth step: Summary

The last step summarizes the project details set in the previous steps, such as the project name, the project language, the resources, the tech version and the library.

Select Open project to end the wizard process and start working on the project.

Info

To quit the wizard at any time:

  1. Select Exit wizard in the upper right corner or the expert.ai icon in the upper left corner.
  2. In the save changes dialog you can select:
    • Cancel to quit.
    • Delete project to delete the project.
    • Save to save the project at that step. You can reopen it from the main dashboard later and continue with the wizard.

Create an extraction project with an imported CPK

To start the import procedure, in the main dashboard:

  1. Select:

    a. The plus button, then Upload CPK.

    Or:

    b. Upload CPK in the Create item area.

    Or:

    c. Extraction in the left menu, then Upload CPK in the Create project area.

  2. In the Create a new project from a CPK window, select Browse files to choose the CPK file. To replace the file once selected, select Replace files.

    Note

    If the CPK contains sections, they will be automatically defined on Platform.

  3. If you selected 1.a. or 1.b., choose the project type (Categorization or Extraction).

  4. Enter the project name in Project name and select the tech version from the Tech version dropdown menu.
  5. Enter an optional description in Description.
  6. Select Create project to start the import.
  7. In Project resources, select Check resources to review the resources and then Next to continue, or select Skip this step to continue directly.
  8. Create the library as already described in the project creation pages for categorization and extraction projects.
  9. Select Open project in the Project summary page to end the creation process and start to work on the project.

Info

If you import a CPK project, you cannot change the resources (taxonomy or classes).

Locked resources are marked with a padlock.

Warning

The maximum size of a CPK file is 2 GB.
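
To avoid a failed upload, you can check the size of a CPK file locally before selecting it. This is a minimal sketch, not a Platform tool: the function and constant names are hypothetical, and because the limit is not specified as binary or decimal, the code assumes the stricter decimal reading (2,000,000,000 bytes).

```python
import os

# Assumption: "2 GB" is read as 2 * 1000**3 bytes. This is the stricter
# interpretation, so a file that passes here also fits under 2 GiB.
MAX_CPK_BYTES = 2 * 1000**3


def cpk_within_size_limit(path: str, limit: int = MAX_CPK_BYTES) -> bool:
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```

The `limit` parameter is there so that the threshold can be adjusted if the actual limit turns out to be measured differently.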