
Concept extraction configuration

Introduction

The models generated in a thesaurus project extract the occurrences of the taxonomy concepts from documents.

Use the Extraction tab of the Edit Concept panel to change the extraction settings for a concept. When the concept is created, the values of these settings are inherited from the corresponding project settings.

Toggle extraction

Use the Extraction toggle to turn concept extraction on and off.

Co-occurrence constraints

You can specify terms that must or must not co-occur with mentions of the concept inside the extraction context.
For example, you may want the concept chair to be extracted only if the term president does not appear in the same paragraph.
Under MANDATORY CONTEXT TERMS and FORBIDDEN CONTEXT TERMS, list the terms that, respectively, must be present or must not be present in the context for the extraction to take place.

  • To add a term, select the plus button below the column header, type the term and press Enter.
  • To edit a term, select it and change it.
  • To delete a term, hover over it and select the X icon.
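
The effect of the two term lists can be illustrated with a minimal Python sketch. This is not the product's implementation: the function names `contains_term` and `should_extract` are hypothetical, matching is assumed to be case-insensitive on whole words, and the sketch assumes that every mandatory term must appear in the context (the source does not say whether all or any of them are required).

```python
import re


def contains_term(context: str, term: str) -> bool:
    """Return True if `term` occurs in `context` as a whole word (case-insensitive)."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, context, flags=re.IGNORECASE) is not None


def should_extract(context: str, mandatory_terms=(), forbidden_terms=()) -> bool:
    """Decide whether a concept mention found in `context` should be extracted.

    Assumption: one forbidden term is enough to block the extraction,
    and all mandatory terms must be present to allow it.
    """
    if any(contains_term(context, term) for term in forbidden_terms):
        return False
    return all(contains_term(context, term) for term in mandatory_terms)


# The "chair" example: block the extraction when "president" is in the paragraph.
furniture = "She sat on a wooden chair near the window."
meeting = "The president took the chair at the annual meeting."

furniture_ok = should_extract(furniture, forbidden_terms=["president"])
meeting_ok = should_extract(meeting, forbidden_terms=["president"])
```

Here `furniture_ok` is true and `meeting_ok` is false, because only the second paragraph contains the forbidden term.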

Forbidden forms

When the extraction uses the semantic or base form method, the model extracts all the inflected forms of the concept labels.
To have some of those forms ignored, add them to the FORBIDDEN FORMS column.

  • To add a form, select the plus button below the column header, type the form and press Enter.
  • To edit a form, select it and change it.
  • To delete a form, hover over it and select the X icon.
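
As a rough illustration of what forbidden forms do, the sketch below filters a list of matched surface forms against a forbidden list. It is only a sketch under stated assumptions: `filter_forbidden_forms` is a hypothetical name, the list of matched forms is invented for the example, and the comparison is assumed to be case-insensitive, which the source does not specify.

```python
def filter_forbidden_forms(matched_forms, forbidden_forms):
    """Drop every matched form that appears in the forbidden list.

    Assumption: forms are compared case-insensitively.
    """
    blocked = {form.casefold() for form in forbidden_forms}
    return [form for form in matched_forms if form.casefold() not in blocked]


# Hypothetical inflected forms matched for a concept labeled "chair".
matched = ["chair", "chairs", "Chaired", "chairing"]

# Keep the noun forms and ignore the verb forms.
kept = filter_forbidden_forms(matched, ["chaired", "chairing"])
```

With these inputs, `kept` contains only "chair" and "chairs".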