Create a thesaurus project

Start the wizard

The procedure to start the thesaurus project wizard depends on which items already exist in the dashboard.

  1. Go to the main dashboard.

If there are no projects or corpora created

  1. Select the Thesaurus project card.

Or:

  1. Select the plus button and then New Thesaurus Project. This creation procedure is always available.

If there are existing projects or corpora but no thesaurus projects

  1. Select Thesaurus in the left menu, then Create your first project.

If there are existing thesaurus projects

  1. Select Thesaurus in the left menu, then select New thesaurus project in the Create project area.

Or:

  1. Select New thesaurus project in the Create item area.

Common preliminary steps

  1. In the New Thesaurus project dialog, enter the mandatory project name in Project name, select the technology version from the Tech version drop-down menu, and optionally enter a description in Description.
  2. Select Create.

First step: language preferences

In the Project language page of the creation wizard:

  1. Select the project languages. The first language you select is automatically marked as Favorite. If you select multiple languages, select the star beside another language to make that language the favorite instead.
  2. Select Next.

Second step: resources

In the Project resources page, select how to create the resources.

  1. Select:

    • Import thesaurus to import a SKOS thesaurus definition file in RDF/XML format.

    Or:

    1. Create project resources to create concepts from scratch, then select Next.
    2. In the Create thesaurus window, enter one or more words representing a concept and then press Enter.
    3. Repeat the previous step as many times as you like. Only one concept is required to create the project, because you can add more concepts later.
    4. Select Next.
  2. Defined concepts are displayed in the Resources and Edit concept panels, where you can edit them. Editing concepts at this stage is not mandatory: you can do it later, once the project has been created.

  3. Select Next.
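For the Import thesaurus option above, it can help to see what a small SKOS file in RDF/XML looks like. The following is a minimal sketch that generates one with Python's standard library. The base URI, the concept identifiers, and the function name `build_skos_rdfxml` are hypothetical, and this page does not state which SKOS constructs the importer accepts, so treat the output as an illustration of the format and not as a guaranteed-valid import file.

```python
import xml.etree.ElementTree as ET

# Standard W3C namespace URIs for RDF, SKOS and the built-in xml: prefix.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Ask ElementTree to serialize with the conventional rdf: and skos: prefixes.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("skos", SKOS_NS)


def build_skos_rdfxml(base_uri, concepts, lang="en"):
    """Return a SKOS thesaurus serialized as an RDF/XML string.

    `concepts` maps a concept id to a tuple
    (preferred label, id of the broader concept or None).
    All ids and the base URI are illustrative assumptions.
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    scheme_uri = f"{base_uri}/scheme"
    ET.SubElement(
        root, f"{{{SKOS_NS}}}ConceptScheme", {f"{{{RDF_NS}}}about": scheme_uri}
    )
    for concept_id, (label, broader_id) in concepts.items():
        concept = ET.SubElement(
            root,
            f"{{{SKOS_NS}}}Concept",
            {f"{{{RDF_NS}}}about": f"{base_uri}/{concept_id}"},
        )
        pref_label = ET.SubElement(
            concept, f"{{{SKOS_NS}}}prefLabel", {f"{{{XML_NS}}}lang": lang}
        )
        pref_label.text = label
        ET.SubElement(
            concept, f"{{{SKOS_NS}}}inScheme", {f"{{{RDF_NS}}}resource": scheme_uri}
        )
        if broader_id is not None:
            # Link the concept to its parent in the hierarchy.
            ET.SubElement(
                concept,
                f"{{{SKOS_NS}}}broader",
                {f"{{{RDF_NS}}}resource": f"{base_uri}/{broader_id}"},
            )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Two hypothetical concepts: "Car" is narrower than "Vehicle".
    print(
        build_skos_rdfxml(
            "http://example.org/thesaurus",
            {"vehicle": ("Vehicle", None), "car": ("Car", "vehicle")},
        )
    )
```

To produce a file for upload, write the returned string to disk with UTF-8 encoding and an `.rdf` or `.xml` extension; the extension the wizard expects is not specified here.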

Third step: documents

In the Project library page:

  1. Enter the library name in Library name or confirm the suggested name, then select Next.
  2. Select the source for the library.

    • Select an existing corpus. If so:

      • Use the search bar to look for a corpus. Your search must contain at least three characters.
      • Select Show table view to view your corpora in a table format.
      • Select Show card view to view your corpora in a card format.
      • When in card view, you can sort items by selecting one of the options from the drop-down menu.
      • When in table view, you can sort items by selecting the desired column header.

      Note

      The information displayed for existing corpora is the same as that displayed in the Corpus info sub-panel of the main dashboard.

    Or:

    • Upload documents

      1. Select Upload.

        • Select Show advanced settings:

          • If you want to disable automatic language detection, turn off Autodetect language and choose the language from the Select language drop-down list.
          • If you want to disable automatic character encoding detection, turn off Autodetect encoding and choose the encoding from the Select encoding drop-down list.
          • If you want to save the documents as a corpus, turn on the Save as corpus button and enter the corpus name.
          • When done, select Hide advanced settings.
      2. Select Add files.

      3. Select the files to upload. The selected files are displayed in a list. To remove a file from the list, select the "X" button to the right of its name.

        Supported formats and limits

        Supported document formats are those managed by the Apache Tika toolkit. Documents are automatically converted to plain text files during upload.

        Documents are ignored if:

        • They are empty.
        • They mainly consist of nonsense words.
        • Their language is unrecognized or not supported (only when automatic language detection is on).
        • They exceed the following values:

          • 200 MB for .zip files.
          • 1 MB for .txt files.
          • 100 MB for other file types.

      4. Select Upload. When the upload is complete, a corpus is created and listed in the window. The corpus is temporary unless you turned on Save as corpus in the advanced settings, in which case it persists.

      5. Select the corpus you created by uploading documents, or another existing corpus.

      6. Select Next.
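The size limits listed under Supported formats and limits can be checked locally before uploading. The sketch below is one possible pre-check and is not part of the product. The function names `skip_reason` and `precheck` are hypothetical, and it assumes that 1 MB means 1024 × 1024 bytes, which this page does not specify. It covers only the empty-file and size rules; the nonsense-word and language checks happen during upload and are not reproduced here.

```python
from pathlib import Path

# Assumption: the documented limits use binary megabytes.
MB = 1024 * 1024

# Per-extension limits taken from the list above; anything else gets 100 MB.
LIMITS = {".zip": 200 * MB, ".txt": 1 * MB}
DEFAULT_LIMIT = 100 * MB


def skip_reason(filename, size_bytes):
    """Return why a file would likely be ignored, or None if it passes."""
    if size_bytes == 0:
        return "empty"
    limit = LIMITS.get(Path(filename).suffix.lower(), DEFAULT_LIMIT)
    if size_bytes > limit:
        return f"exceeds {limit // MB} MB limit"
    return None


def precheck(paths):
    """Map each problematic path to its reason; files that pass are omitted."""
    report = {}
    for path in map(Path, paths):
        reason = skip_reason(path.name, path.stat().st_size)
        if reason is not None:
            report[str(path)] = reason
    return report


if __name__ == "__main__":
    import sys

    # Usage: python precheck.py file1.txt archive.zip ...
    for name, reason in precheck(sys.argv[1:]).items():
        print(f"{name}: {reason}")
```

Keeping `skip_reason` free of file-system access makes the rules easy to adjust if the limits change.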

Final step: Summary

The last step of the wizard sums up project information.

Note

The number of stars in Thesaurus Quality rates the project based on the information it contains, such as concepts and documents.

Select Open project to end the wizard process and start working on the project.

Info

To quit the wizard at any time:

  1. Select Exit wizard in the upper right corner.

Or:

  1. Select the expert.ai icon in the upper left corner.
  2. In the save changes dialog you can select:
    • Cancel to quit.
    • Delete project to delete the project.
    • Save to save the project at the current step. You can reopen it from the main dashboard later and continue with the wizard.