Filter and search documents

Filter documents based on presence/absence of annotations, extractions or validations

To filter a list of documents based on the presence or absence of annotations, extractions and validations, use the Filter documents dialog.
To open it, select Filters at the top of the list.
To close it without applying changes, just click anywhere outside the dialog.

To define and apply the filter:

  1. Select Filters at the top of the document list.
  2. In the Filter documents dialog, select the Resources tab. You'll see three rows:

    • Validated documents
    • Documents with annotations
    • Documents with extractions

    The button to the right of each row acts as a triple-state switch:

    Button color | Current state | Click effect
    Gray | No filter | Passes to the "Presence" state
    Green | Presence: documents must have the selected feature | Passes to the "Absence" state
    Red | Absence: documents must not have the selected feature | Passes to the "No filter" state, i.e. empties the filter
  3. Click the buttons to the right of the rows repeatedly until you get the desired effect.

  4. Select Filter documents. The dialog closes and the filter is applied.

Example

If you want to filter documents with extractions, but without annotations and validations:

  1. Double-click Documents with annotations.
  2. Click Documents with extractions.
  3. Double-click Validated documents.
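
The triple-state switch and the example above can be pictured as a small state machine. The Python sketch below is only an illustration of the behavior described in this article, not the product's implementation. The names `FilterState`, `click` and `matches`, and the representation of a document as a dictionary of features, are assumptions made for the example.

```python
from enum import Enum


class FilterState(Enum):
    NO_FILTER = "gray"
    PRESENCE = "green"
    ABSENCE = "red"


# Order in which one click moves the switch to the next state.
_CYCLE = [FilterState.NO_FILTER, FilterState.PRESENCE, FilterState.ABSENCE]


def click(state, times=1):
    """Return the state reached after clicking the button `times` times."""
    index = (_CYCLE.index(state) + times) % len(_CYCLE)
    return _CYCLE[index]


def matches(document, filters):
    """Check a document (hypothetical dict: feature -> bool) against the filter."""
    for feature, state in filters.items():
        has_feature = bool(document.get(feature, False))
        if state is FilterState.PRESENCE and not has_feature:
            return False
        if state is FilterState.ABSENCE and has_feature:
            return False
    return True


# The example above: one click requests presence, a double-click requests absence.
filters = {
    "annotations": click(FilterState.NO_FILTER, 2),
    "extractions": click(FilterState.NO_FILTER, 1),
    "validated": click(FilterState.NO_FILTER, 2),
}

# A document with extractions only passes the filter.
print(matches({"extractions": True}, filters))  # True
# A document that also has annotations is excluded.
print(matches({"extractions": True, "annotations": True}, filters))  # False
```

A third click on any row returns it to the "No filter" state, which is why clicking repeatedly always lets you reach the state you want.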

The document list remains filtered until you change or remove the filter.
You can remove parts of the filter on the filter bar at the top of the document list by selecting the "X" button in the item you want to remove.

To completely remove the filter:

  1. Select Filters at the top of the document list. The Filter documents dialog appears.
  2. Select Reset to restore the initial state of the dialog.
  3. Select Filter documents to close the dialog and apply the empty filter.

Filter by concept

The document lists in the Documents and Experiments panels can be filtered based on specific extractions and annotations, or on thesaurus concepts.

  1. In the left panel, select the Thesaurus tab. It displays concepts that have been annotated in the current library and concepts that were extracted at least once during the selected experiment.
  2. Select Open or Close beside Extractions and Annotations to expand or collapse the lists.
  3. Double-click one or more items inside the Extractions and Annotations lists. Clicked items become search criteria shown in the search box, where they can be edited as described in the article about searching corpus documents.

Info

In the Thesaurus panel, the number to the right of Extractions and Annotations is the number of concepts with at least one extraction or annotation, not the total number of extractions or annotations.
The number to the right of each concept in the lists is the number of documents in which the concept has been extracted or annotated (possibly multiple times), not the total number of extractions or annotations of the concept.
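
To make the difference between these counts concrete, here is a minimal Python sketch. The document names, the concepts and the data structure are invented for illustration and do not come from the product.

```python
from collections import Counter

# Hypothetical data: for each document, the concepts extracted in it,
# repeated when a concept is extracted more than once.
extractions_by_document = {
    "doc1.txt": ["finance", "finance", "sports"],
    "doc2.txt": ["finance"],
    "doc3.txt": [],
}

# Number next to each concept: documents that contain the concept at least once.
documents_per_concept = Counter()
for concepts in extractions_by_document.values():
    documents_per_concept.update(set(concepts))

# Number next to "Extractions": concepts with at least one extraction.
concept_count = len(documents_per_concept)

# For comparison: the total number of extractions, which the panel does not show.
total_extractions = sum(len(c) for c in extractions_by_document.values())

print(concept_count)                     # 2
print(documents_per_concept["finance"])  # 2
print(total_extractions)                 # 4
```

Here "finance" is extracted three times in total, but it counts as 2 because it appears in two documents.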

Filter by entity

Filter documents by entity types

To filter a list of documents based on the presence or absence of entities of given types, use the Filter documents dialog.
To open it, select Filters at the top of the list.
To close it without applying changes, just click anywhere outside the dialog.

To define and apply the filter:

  1. Select Filters at the top of the document list.
  2. In the Filter documents dialog, select the Entities tab. You'll see a row for every possible entity type. The number to the right of each row is the number of entities of that type that were recognized in documents.

    The button to the right of each row with an entity count greater than zero acts as a triple-state switch as described above.

  3. Click the buttons to the right of the rows repeatedly until you get the desired effect.

  4. Select Filter documents. The dialog closes and the filter is applied.

The central panel, if in list view, or the left panel, if in detail view, displays the filtered documents list.

Example

To filter the documents that contain People and Company, but not Mass media:

  1. Select People and Company.
  2. Double-click Mass media.

To remove the filter:

  1. Deselect all selected items, or select Reset.
  2. Select Filter documents.

Filter by entity value

  • In the list view:

    1. In the left panel, select the Entities tab.
    2. Select an entity type.
    3. Double-click an entity.
    4. Repeat from step 2 or 3 to add more entities.
  • In the detail view:

    1. In the right panel, select the Entities tab.
    2. Double-click an entity.
    3. Repeat from step 2 to add more entities.

Double-click selections become search criteria and are displayed in the search box, where they can be edited as described in the article about search.

Filter documents by entity, extraction, validation, annotation in the Analytics sub-tab

In the Analytics sub-tab of the Documents tab, you can filter your documents with the:

  • Documents panel.
  • Coverage panel.
  • Relevant documents sub-panel.

Note

The procedure in the Filter documents window is the same as described above.

Documents panel

  • Select the Filters icon.

Or:

  1. Select Expand.
  2. Select Filters.
  3. Select your filters and then select Filter documents.

Coverage panel

To apply a positive filter from the Coverage panel:

  • Double-click Extracted documents or the orange line on the chart to activate the Documents with extractions filter in the Filter documents window.
  • Double-click Annotated documents or the blue line on the chart to activate the Documents with annotations filter in the Filter documents window.
  • Double-click Validated documents or the turquoise line on the chart to activate the Validated documents filter in the Filter documents window.

When done, to view the filters in the Documents panel:

  • Select the Filters icon.

Or:

  1. Select Expand.
  2. Select Filters.

Relevant documents

  • To apply a positive filter that corresponds to the Documents with extractions option in the Filter documents window:

    1. Check the Extractions box.
    2. Select Most extracted.

    Or, in case of negative filtering:

    1. Select Not extracted.

  • To apply a positive filter that corresponds to the Documents with annotations option in the Filter documents window:

    1. Check the Annotations box.
    2. Select Most annotated.

    Or, in case of negative filtering:

    1. Select Not annotated.

When done, to view the filters in the Documents panel:

  • Select the Filters icon.

Or:

  1. Select Expand.
  2. Select Filters.

Filter by token value

  • In the list view:

    1. In the left panel, select the Tokens tab.
    2. Select a token type.
    3. Double-click a token value.
    4. Repeat from step 2 to add more tokens.
  • In the detail view:

    1. In the right panel, select the Tokens tab.
    2. Expand a token type.
    3. Double-click a token.
    4. Repeat from step 2 or 3 to add more tokens.

Double-click selections become search criteria and are displayed in the search box, where they can be edited as described in the article about search.

Filter by Main topics

To filter on a main topic, when in detail view, double-click one or more topics displayed in the Main topics strip above the text of the document.

Filter by document name

To filter documents based on file name, in the list view (or in the Documents Preview panel of the Resources tab), enter the file name or part of it (at least three characters) in the search box above the list of documents and press Enter. Only documents whose file name contains the specified string will be displayed.
To cancel the filter, select the icon on the right of the search bar.
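
The matching rule can be pictured as a simple substring test. The Python sketch below is an illustration only: the function name `filter_by_name` and the sample file names are hypothetical. Two behaviors are assumptions, because this article does not specify them: the comparison ignores case, and a query that is too short is rejected with an error.

```python
MIN_QUERY_LENGTH = 3  # the search box requires at least three characters


def filter_by_name(file_names, query):
    """Keep only the file names that contain `query`.

    Assumption: the comparison ignores case; the article does not say.
    Assumption: a query that is too short is rejected with an error.
    """
    if len(query) < MIN_QUERY_LENGTH:
        raise ValueError(f"Enter at least {MIN_QUERY_LENGTH} characters")
    needle = query.lower()
    return [name for name in file_names if needle in name.lower()]


names = ["report_2021.pdf", "Annual-Report.docx", "notes.txt"]
print(filter_by_name(names, "report"))
# ['report_2021.pdf', 'Annual-Report.docx']
```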

Filter by language

To filter documents based on their language, in list or in detail view—or in the Documents Preview panel of the Resources tab—select one of the options from the drop-down menu above the document list.

"In list" Vs. "Not in list"

When applying filters or performing searches when in detail view, documents are marked by two different icons:

Search documents

To carry out a search, proceed exactly as for a search in a corpus; refer to the article that describes it.

Find text in a document

To find text in a document, in detail view, enter the search criteria in the Find text bar (minimum three characters), then press Enter.

To cancel the search, select the icon on the right of the search bar.

Display annotations and extractions in a document

In the detail view, it is possible to display the annotations and the extractions within a document, if any, by selecting the related icons:

Display the concept quality within a document

In the detail view, it is possible to display the concept quality within a document, if any, by selecting the related icons:

  • for True positive.
  • for False positive.
  • for False negative.