Create an extraction project
Possibilities
You can create a new extraction project using a wizard that you launch on the main dashboard.
There are two possibilities:
- Create the project from scratch.
- Bootstrap the project by importing a CPK. This already provides the project with a model to experiment with. In such a project, the definitions of information classes are taken from the CPK and become immutable: users cannot add, remove, rename or reorganize them.
Creation from scratch: first steps
Launch the wizard
- Select the plus button and then Extraction project.
Or:
- If there are no projects or corpora, select the Extraction project card.
Or:
- If there are already other projects or corpora, but no extraction project, select Extraction in the left menu, then Create your first project.
Or:
- If there are already other extraction projects, select Extraction in the left menu, then select New extraction project in the Create project area.
Or:
- Select New extraction project in the Create item area.
First step: project info and tech version
Enter the project name, select the tech version and optionally enter a description, then select Create.
Second step: project language
Select the project language, then select Next.
Third step: project resources
In the Project resources page you must define—or just draft—the project resources. You can always edit them once the project has been created.
Use one of the following procedures:
- To define information classes from scratch, select Create project classes, then select Next.
- In the Create classes dialog, enter the name of a class, then press Enter to confirm and possibly add another. When done, select Next.
- Defined classes are displayed in the Resources panel. It is possible to edit them, manage sections and define metadata classes. Select Next to go on.
Or:
- To import the definition of information classes from a JSON file:
- Select Import classes.
- Open a file containing the JSON.
Defined classes are displayed in the Resources panel. It is possible to edit them and also to define sections and metadata classes.
Select Next to go on.
Creation with a CPK: first steps
Launch the wizard
- Select the plus button, then Upload CPK.
Or:
- Select Upload CPK in the Create item area.
Or:
- Select Extraction in the left column, then Upload CPK in the Create project area for an extraction project.
First step: project type, CPK, project info and tech version
- In the Create a new project from a CPK dialog, select Browse files to choose the CPK file, or Replace files if you want to replace it after you have chosen one.
- If you started the wizard in one of the first two ways described above, select the type of project, that is Extraction.
- Enter the project name, select the tech version and optionally enter a description, then select Create project.
Second step: check project resources
The resources of the project, that is the information classes and any definition of sections, are automatically taken from the CPK and become immutable: the padlock icon will appear next to the project name to symbolize this fact.
Select Check resources to examine those resources, then select Next when you are ready to go to the next step. If you are not interested in checking the imported resources, select Skip this step to continue the wizard.
Common step: library
The wizard requires the creation of an initial library to train or test models. You can edit the library and add other libraries once the project has been created.
If you create the initial category tree with the magic taxonomy tool (see above), you are required to define the library during that step of the wizard. Otherwise, the wizard asks you to define the library after the initial resources of the project have been defined.
When the wizard asks you to define the library:
- Optionally enter the library name. The name will be chosen automatically if you don't specify one.
- Select the type of library and select Next.
- In Corpora and folders, select the source for the library. You can select an existing corpus or upload documents from the file system.
- If you choose an existing corpus, select the corpus and then select Next.
Warning
The listed corpora are those owned by you or shared with you and, among them, only those associated with the tech version you previously selected for the new project.
- If you choose to upload documents, learn how to do it by reading the article dedicated to the topic.
It is possible to upload an annotated library.
The upload process creates an ephemeral corpus that is listed and can be chosen as a source of documents for the library.
If there are many corpora, you can use these navigation tools:
- Use the search bar to look for a corpus. Type at least three characters of the corpus name.
- Select Show table view to list corpora in a table format.
- Select Show card view to display corpora as cards.
- When in table view, you can sort items by selecting the desired column header.
- When in card view, select Share to share the corpora with other users.
The information shown for corpora is the same as that displayed in the Corpus info sub-panel of the main dashboard.
- When done, select Next.
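Two rules in this step decide which corpora you see: the warning restricts the list to corpora you own or that are shared with you and that match the project's tech version, and the search bar needs at least three characters. The following Python sketch illustrates those rules only. It is not the platform's implementation: the `Corpus` record, its field names, the `listed_corpora` function and the sample values (including the `v1`/`v2` version labels) are all hypothetical, and the case-insensitive substring match is an assumption about how the search behaves.

```python
from dataclasses import dataclass, field


@dataclass
class Corpus:
    # Hypothetical record: field names are illustrative, not the platform's data model.
    name: str
    owner: str
    tech_version: str
    shared_with: set[str] = field(default_factory=set)


def listed_corpora(corpora, user, project_tech_version, query=""):
    """Return the corpora this wizard step would list, under the stated assumptions."""
    # Rule from the warning: owned by or shared with the user,
    # and associated with the tech version chosen for the project.
    visible = [
        c for c in corpora
        if (c.owner == user or user in c.shared_with)
        and c.tech_version == project_tech_version
    ]
    # Rule from the search bar: a query is applied only from three characters on.
    # Assumption: case-insensitive substring match on the corpus name.
    if len(query) >= 3:
        needle = query.lower()
        visible = [c for c in visible if needle in c.name.lower()]
    return visible


# Made-up sample data for illustration.
corpora = [
    Corpus("Invoices 2023", owner="ann", tech_version="v2"),
    Corpus("Contracts", owner="bob", tech_version="v2", shared_with={"ann"}),
    Corpus("Legacy invoices", owner="ann", tech_version="v1"),
    Corpus("Private", owner="bob", tech_version="v2"),
]

for corpus in listed_corpora(corpora, user="ann", project_tech_version="v2"):
    print(corpus.name)
```

With this sample data, user `ann` sees only the two `v2` corpora she owns or has been given access to. A corpus tied to a different tech version stays hidden even though she owns it, which is why an expected corpus may be missing from the list.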
Creation from scratch: final step
The final step is a summary of the wizard.
Select Open project to end the wizard and start working on the project.
Creation from a CPK: final step
The final step is a summary of the wizard.
Turn on Start an experiment if you want to immediately start an experiment, using the CPK model against the document library, when you open the project.
Select Open project to end the wizard and start working on the project.
Wizard control
- To quit the wizard at any step, select Exit wizard in the upper right corner or select the expert.ai icon in the upper left corner of the page.
- When asked to save changes, you can select Delete project to delete the project. If your changes are saved, you can interrupt the wizard and resume it later by selecting the project on the main dashboard.