Use the presence-absence filter
The Filter documents dialog allows you to filter the documents of the current library based on the presence or absence of NLU analysis results, annotations, validation flags, experiment results, structural elements and experiment metrics.
The dialog is available at various points in project dashboards where lists of documents are displayed.
Filter characteristics vary depending on the type of project.
You open the dialog by selecting Filters above the list of documents.
To close the dialog without making any changes to the filter, click anywhere outside the dialog area.
From left to right the dialog is divided into two areas:
- The features on whose presence or absence documents can be filtered are shown in the left area.
- The current filter is shown in the right area.
In the left area of the dialog, document features that can be used as filters are grouped into tabs:
Resources shows features that are related to project resources, like categories and classes. The two numbers next to each item indicate the number of extractions in the current experiment and the number of annotations.
Entities shows all the types of named entities. The number next to each type indicates the number of entities of that type found in the documents by NLU analysis.
Metrics is available only if the dialog has been opened from the documents statistics view of an experiment.
The features in the Metrics tab refer to the metrics of documents after the experiment.
See the articles about the documents statistics view for more information.
Structure shows structural features of the documents.
- Use Documents with layout to filter the documents for which graphical layout information is available. These are documents originating from PDF files that were uploaded with the Pdf document view option turned on.
Document sections lists the sections defined as project resources.
The default section is not listed because it is annotated by exclusion in every document.
The filter area on the right contains a green box and a red box. The green box represents the positive part of the filter, that is, the features that documents must have (presence). The red box represents the negative part of the filter, that is, the features that documents must not have (absence).
Set the filters
The buttons with the filter symbol next to the features are bi- or tri-state and the effect of the click is described below.
|Button status|Effect of a click|
|---|---|
|No such feature in the documents| |
|Not in filter|Change to Presence status: the feature is added to the green box in the filter area|
|Presence: the document must have the feature|If bi-state, change to Not in filter status (the feature is removed from the filter); if tri-state, change to Absence status (the feature is added to the red box in the filter area)|
|Absence: the document must not have the feature|Change to Not in filter status: the feature is removed from the filter|
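The click behavior described above can be read as a small state machine. The following Python sketch is only an illustration of those transitions, not the product's implementation: the names `FilterState` and `next_state` are hypothetical, and the assumption that a button with no matching feature does not react to clicks is ours, since no click effect is described for that status.

```python
from enum import Enum


class FilterState(Enum):
    """Statuses of a filter button (hypothetical names)."""
    UNAVAILABLE = "no such feature in the documents"
    NOT_IN_FILTER = "not in filter"
    PRESENCE = "presence"
    ABSENCE = "absence"


def next_state(state: FilterState, tri_state: bool) -> FilterState:
    """Return the status a button moves to after one click."""
    if state is FilterState.UNAVAILABLE:
        # Assumption: no click effect is documented, so the status is unchanged.
        return state
    if state is FilterState.NOT_IN_FILTER:
        # The feature is added to the green box.
        return FilterState.PRESENCE
    if state is FilterState.PRESENCE:
        # Tri-state buttons move on to the red box; bi-state buttons reset.
        return FilterState.ABSENCE if tri_state else FilterState.NOT_IN_FILTER
    # ABSENCE: the feature is removed from the filter.
    return FilterState.NOT_IN_FILTER
```

Under this reading, a tri-state button cycles through Not in filter, Presence and Absence, while a bi-state button toggles between Not in filter and Presence.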
Consider this example:
Geography in the green box and People in the red box means that the filtered documents must contain mentions of geographical places but no names of persons.
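The logic of this example can be sketched in a few lines of Python. This is a minimal illustration with made-up documents, not the product's code: the function `matches` and the document names are hypothetical, and it assumes that a document must have every feature in the green box and none of the features in the red box.

```python
def matches(features: set[str], presence: set[str], absence: set[str]) -> bool:
    """True if a document has all presence features and no absence features."""
    return presence <= features and features.isdisjoint(absence)


# Hypothetical documents, each described by the set of features found in it.
documents = {
    "doc1": {"Geography"},
    "doc2": {"Geography", "People"},
    "doc3": {"People"},
    "doc4": set(),
}

presence = {"Geography"}  # green box
absence = {"People"}      # red box

selected = [name for name, feats in documents.items()
            if matches(feats, presence, absence)]
print(selected)  # ['doc1']
```

Only `doc1` passes: `doc2` contains a feature from the red box, while `doc3` and `doc4` lack the feature required by the green box.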
To remove an element from the filter you can also select the X icon next to it.
The number next to the tab label in the left area is the number of filter elements taken from the tab.
In the Documents with quality area of the Quality tab, when the bi-state button next to Precision, Recall or F-measure is in the Presence status (see above), you can choose between N/A (not available) and a percentage range.
To apply the filter, select Filter documents.
With the filter set, the Filters button above the list of documents shows a number corresponding to the number of filter elements.
In the more extensive views, the filtered elements are also shown next to the Filters button and can be removed directly by selecting the X icon.
In collapsed views the Filters button and the filter elements counter are shown in a compact way.
While you are in the immersive view, both filters and search criteria are reported in the Filters pop-up menu in read-only mode.
Reset the filter
You can reset the filter either in the Filter documents dialog or in condensed visualizations by selecting Reset.