Use the presence-absence filter
The Filter documents dialog allows you to set up a filter on the documents of the current library based on the presence or absence of annotations, validation flags, experiment results and types of named entities.
The dialog is available at various points in project dashboards where lists of documents are displayed.
You open the dialog by selecting Filters above the list of documents.
To close the dialog without making any changes to the filter, click anywhere outside the dialog area.
From left to right the dialog is divided into two areas:
- The features on whose presence or absence documents can be filtered are shown in the left area.
- The current filter is shown in the right area.
In the left area of the dialog, document features that can be used as filters are grouped into tabs:
Resources shows features that are related to project resources, like categories and classes. The two numbers next to each item indicate the number of extractions in the current experiment and the number of annotations.
Entities shows all the types of named entities. The number next to each type indicates the number of entities of that type found in the documents.
Quality is available only if the dialog has been opened from the document statistics page of an experiment. For more information, see:
- Document statistics for categorization experiments
- Document statistics for extraction experiments
- Document statistics for thesaurus experiments
The features in the Quality tab refer to the metrics of the documents after the current experiment.
The filter area on the right contains a green box and a red box. The green box represents the positive part of the filter, that is, the features that documents must have (presence). The red box represents the negative part of the filter, that is, the features that documents must not have (absence).
Set the filters
The buttons with the filter symbol next to the features have either two states (bi-state) or three states (tri-state). The table below lists, for each button color, the corresponding status and the effect of a click.
|Button color|Status|Effect of the click|
|---|---|---|
|Light gray|No such feature in the documents|N/A|
|Dark gray|Not in filter|Changes to Presence status: the feature is added to the green box in the filter area|
|Green|Presence: documents must have the feature|If bi-state, changes to Not in filter status (the feature is removed from the filter); if tri-state, changes to Absence status (the feature is added to the red box in the filter area)|
|Red|Absence: documents must not have the feature|Changes to Not in filter status: the feature is removed from the filter|
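The click behavior can also be read as a small state machine. The following Python sketch is only an illustration of the transitions described in the table; the names `Status` and `next_status` are hypothetical and are not part of the product.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical names; the values are the button colors from the table.
    UNAVAILABLE = "light gray"    # no such feature in the documents
    NOT_IN_FILTER = "dark gray"   # feature exists but is not in the filter
    PRESENCE = "green"            # documents must have the feature
    ABSENCE = "red"               # documents must not have the feature


def next_status(current: Status, tri_state: bool) -> Status:
    """Return the status a button moves to after one click."""
    if current is Status.UNAVAILABLE:
        return current  # N/A: a click changes nothing
    if current is Status.NOT_IN_FILTER:
        return Status.PRESENCE
    if current is Status.PRESENCE:
        # Only tri-state buttons can reach the Absence status.
        return Status.ABSENCE if tri_state else Status.NOT_IN_FILTER
    # ABSENCE: the next click removes the feature from the filter.
    return Status.NOT_IN_FILTER
```

In this sketch a tri-state button cycles through Not in filter, Presence, Absence and back, while a bi-state button only toggles between Not in filter and Presence.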
Consider this example:
Geography in the green box and People in the red box means that filtered documents must contain mentions of geographical places but no names of persons.
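The logic of this example can be sketched with set operations. This is a minimal illustration, not the product's implementation: the function `matches` and the sample documents are hypothetical, and the sketch assumes that when several features are in the green box a document must have all of them.

```python
def matches(doc_features: set[str], present: set[str], absent: set[str]) -> bool:
    """Hypothetical check: keep a document only if it has every feature
    in the green box (present) and none of those in the red box (absent)."""
    return present <= doc_features and doc_features.isdisjoint(absent)


# Made-up documents, each described by the entity types found in it.
docs = {
    "doc1": {"Geography"},
    "doc2": {"Geography", "People"},
    "doc3": {"People"},
}

filtered = [
    name
    for name, features in docs.items()
    if matches(features, present={"Geography"}, absent={"People"})
]
print(filtered)  # ['doc1']
```

Only `doc1` passes: `doc2` mentions places but also contains names of persons, and `doc3` has no geographical mentions.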
To remove an element from the filter you can also select the X icon next to it.
The number next to the tab label in the left area is the number of filter elements taken from the tab.
In the Documents with quality area of the Quality tab, when the bi-state button next to Precision, Recall or F-measure is in Presence status (see above), you can choose between N/A (not available) and a percentage range.
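The choice between N/A and a percentage range can be pictured as follows. This is a sketch under stated assumptions: the function `quality_matches` is hypothetical, a metric that is not available is modeled as `None`, and the range bounds are assumed to be inclusive, which the dialog may or may not do.

```python
from typing import Optional, Tuple


def quality_matches(
    metric: Optional[float],
    choice: Optional[Tuple[float, float]],
) -> bool:
    """Hypothetical check for one quality metric of one document.

    metric: the document's value as a percentage, or None if not available.
    choice: None to select N/A documents, or (low, high) percentage bounds.
    """
    if choice is None:
        # N/A selected: keep only documents without a value for the metric.
        return metric is None
    if metric is None:
        return False
    low, high = choice
    return low <= metric <= high  # assumption: both bounds are inclusive
```

For example, `quality_matches(85.0, (80.0, 100.0))` keeps a document whose precision is 85%, while `quality_matches(None, None)` keeps a document for which the metric is not available.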
To apply the filter select Filter documents.
With the filter set, the Filters button above the list of documents shows a number corresponding to the number of filter elements.
In the more extensive views, the filtered elements are also shown next to the Filters button and can be removed directly by selecting the X icon.
In collapsed views the Filters button and the filter elements counter are shown in a compact way.
Reset the filter
You can reset the filter by selecting Reset, either in the Filter documents dialog or in condensed views.