Basic extraction configuration

Introduction

Project models can identify and extract the mentions of the taxonomy concepts from documents.

Use the Extraction tab of the Edit concept panel and its toolbar to change the basic extraction settings for a concept.

The extraction method and the context settings are determined by the project's extraction settings, which in turn derive from the choices made during project creation. See the dedicated pages for more details.

More advanced settings can be managed in the Advanced extraction tab.

Toggle extraction

Use the Extraction toggle on the panel toolbar to turn extraction on and off.
If extraction is disabled, the concept label in the taxonomy tree is struck through.

Co-occurrence constraints

You can specify terms that must or must not co-occur, inside the extraction context, with mentions of the concept to be extracted.
For example, you may want the concept of chair to be extracted only if the term president is not also present in the same paragraph.
Under MANDATORY CONTEXT TERMS and FORBIDDEN CONTEXT TERMS, enter the terms that, respectively, must be present or must not be present in the context for the extraction to take place.

  • To add a term:

    1. Select the plus button, type the term and press Enter.
    2. From the Case drop-down menu, select one of the following case options:

      • Default
      • Sensitive
      • Insensitive

      Note

      Case options correspond, respectively, to the following extraction methods:

      • Exact label
      • Exact label same case
      • Exact label case insensitive

      Check the settings page for details.

  • To edit a term, select it and change it.

  • To delete a term, hover over it and select the X icon.
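The co-occurrence logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the product's implementation: the names `ContextTerm`, `term_in_context` and `extraction_allowed` are hypothetical, the context is modeled as a plain string (for example one paragraph), and the sketch assumes that every mandatory term must be present and that no forbidden term may be present. Only the Sensitive and Insensitive case options are modeled; the behavior of Default depends on the project settings and is left out.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextTerm:
    """Hypothetical stand-in for one mandatory or forbidden context term."""
    text: str
    case_sensitive: bool = False  # assumption: models Sensitive vs. Insensitive


def term_in_context(term: ContextTerm, context: str) -> bool:
    """Return True if the term occurs in the context as a whole word or phrase."""
    flags = 0 if term.case_sensitive else re.IGNORECASE
    # Lookarounds keep "chair" from matching inside "chairman".
    pattern = r"(?<!\w)" + re.escape(term.text) + r"(?!\w)"
    return re.search(pattern, context, flags) is not None


def extraction_allowed(context: str, mandatory=(), forbidden=()) -> bool:
    """Assumed rule: all mandatory terms present and no forbidden term present."""
    has_all_mandatory = all(term_in_context(t, context) for t in mandatory)
    has_no_forbidden = not any(term_in_context(t, context) for t in forbidden)
    return has_all_mandatory and has_no_forbidden


if __name__ == "__main__":
    forbidden = [ContextTerm("president")]
    furniture = "The chair by the window needs a new cushion."
    meeting = "The president asked the chair to open the meeting."
    print(extraction_allowed(furniture, forbidden=forbidden))  # True
    print(extraction_allowed(meeting, forbidden=forbidden))    # False
```

In this sketch, the chair example from the text maps to a single forbidden term, president, checked against each paragraph in which a mention of chair is found.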

Forbidden forms

When extraction uses the semantic or base form methods, the model extracts all the inflected forms of the concept labels.
If you want some forms to be ignored, add them to the FORBIDDEN FORMS column.

  • To add a form:

    1. Select the plus button, type the form and press Enter.
    2. From the Case drop-down menu, select one of the following case options:

      • Default
      • Sensitive
      • Insensitive

      Note

      As mentioned above, check the settings page for details.

  • To edit a form, select it and change it.

  • To delete a form, hover over it and select the X icon.
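The effect of forbidden forms can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the function `find_mentions` is hypothetical, the inflected forms of the label are supplied by hand instead of being derived by the model, and both matching and the forbidden-form comparison are case-insensitive, which may differ from the case option selected in the panel.

```python
import re


def find_mentions(text, forms, forbidden_forms=()):
    """Return the mentions of the given forms in text, skipping forbidden forms.

    Assumption: `forms` is a hand-written list of inflected forms of a concept
    label; the real model derives such forms on its own.
    """
    blocked = {form.lower() for form in forbidden_forms}
    allowed = [form for form in forms if form.lower() not in blocked]
    if not allowed:
        return []
    # Try longer forms first so that "chairs" is preferred over "chair".
    ordered = sorted(allowed, key=len, reverse=True)
    alternatives = "|".join(re.escape(form) for form in ordered)
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)
    return [match.group(0) for match in pattern.finditer(text)]


if __name__ == "__main__":
    forms = ["chair", "chairs", "chaired", "chairing"]
    text = "She chaired the panel while two chairs stood empty."
    print(find_mentions(text, forms))                               # ['chaired', 'chairs']
    print(find_mentions(text, forms, forbidden_forms=["chaired"]))  # ['chairs']
```

Here, adding chaired as a forbidden form keeps the verbal use out of the results while the other inflected forms are still matched.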