Basic extraction configuration

Introduction

Project models can identify and extract mentions of taxonomy concepts from documents.

Use the Extraction tab of the Edit concept panel and its toolbar to change the basic extraction settings for a concept.

For new concepts, the initial values of some of these settings are determined by the project's extraction settings, which in turn derive from the choices made during project creation.

More advanced settings can be managed in the Advanced extraction tab.

Toggle extraction

Use the Extraction toggle on the panel toolbar to turn extraction on and off.
If extraction is disabled, the concept label in the taxonomy tree is shown struck through.

Co-occurrence constraints

You can specify terms that must, or must not, appear in the extraction context together with a mention of the concept for that mention to be extracted.
For example, you may want the concept of chair to be extracted only if the term president is not also present in the same paragraph.
Under MANDATORY CONTEXT TERMS and FORBIDDEN CONTEXT TERMS you can put terms that, respectively, must be present or must not be present in the context for the extraction to take place.

  • To add a term, select the plus button below the column header, type the term and press Enter.
  • To edit a term, select it and change it.
  • To delete a term, hover over it and select the X icon.
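
The effect of these two columns can be pictured with a small sketch. This is only an illustration of the logic described above, not the product's implementation: the function name `should_extract` is hypothetical, matching is reduced to case-insensitive whole words, and the sketch assumes that every mandatory term must be present, which the documentation does not state explicitly.

```python
import re


def should_extract(context, mandatory_terms=(), forbidden_terms=()):
    """Decide whether a concept mention in `context` would be kept.

    Hypothetical helper: assumes ALL mandatory terms are required and
    that ANY forbidden term blocks the extraction.
    """
    # Reduce the context (for example, a paragraph) to lowercase words.
    words = set(re.findall(r"\w+", context.lower()))
    has_all_mandatory = all(t.lower() in words for t in mandatory_terms)
    has_no_forbidden = not any(t.lower() in words for t in forbidden_terms)
    return has_all_mandatory and has_no_forbidden


# The "chair" example: "president" is a forbidden context term.
print(should_extract("The chair was made of oak.",
                     forbidden_terms=["president"]))
print(should_extract("The president named a new chair.",
                     forbidden_terms=["president"]))
```

Under these assumptions the first call keeps the mention and the second one rejects it, because "president" occurs in the same context as "chair".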

Forbidden forms

When extraction uses the semantic or base form method, the model extracts all the inflected forms of the concept labels.
If you want some forms to be ignored, add them to the FORBIDDEN FORMS column.

  • To add a form, select the plus button below the column header, type the form and press Enter.
  • To edit a form, select it and change it.
  • To delete a form, hover over it and select the X icon.
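
As a rough illustration of what the FORBIDDEN FORMS column does, the sketch below drops unwanted forms from a list of matched forms. The function `filter_forms` and the sample forms are hypothetical, and the case-insensitive comparison is an assumption rather than documented behavior.

```python
def filter_forms(matched_forms, forbidden_forms):
    """Return the matched forms that are not listed as forbidden.

    Hypothetical helper: compares forms case-insensitively (an assumption).
    """
    forbidden = {form.lower() for form in forbidden_forms}
    return [form for form in matched_forms if form.lower() not in forbidden]


# Illustrative forms a base form method might match for the label "chair".
matched = ["chair", "chairs", "chaired", "chairing"]
print(filter_forms(matched, forbidden_forms=["chaired", "chairing"]))
```

Here only "chair" and "chairs" remain, since the other two forms were listed as forbidden.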