The Documents tab
This article describes the features of the Documents tab of the project dashboard that are specific to extraction projects.
The characteristics of the tab that are common to all project types are described in the article dedicated to the topic.
Sort the document list
In addition to the sorting options shared with other project types, the list view of the Documents tab offers the Extractions sorting option. It is available in the context of an experiment and sorts the list of documents by number of extractions.
Detail view's central panel
In the default visualization of the detail view, the third bar from the top in the central panel shows buttons displaying the number of:
- Extractions
- Annotations
- True positive
- False positive
- False negative
The buttons also act as toggle switches: to toggle the highlight of all the extractions, annotations, true positives, etc., select the corresponding button.
You can select one or both of the first two buttons, or one or more of the last three, but the two groups are mutually exclusive: if you select a button in one group, any buttons selected in the other group are deselected.
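The selection rule above can be pictured as a small state model. The following is only an illustrative sketch: the button identifiers and the `toggle` function are hypothetical names chosen for the example and are not part of the Platform.

```python
# Hypothetical model of the five highlight buttons; names are illustrative only.
GROUP_A = {"extractions", "annotations"}
GROUP_B = {"true_positive", "false_positive", "false_negative"}


def toggle(selected: set[str], button: str) -> set[str]:
    """Return the new set of selected buttons after `button` is pressed."""
    if button not in GROUP_A | GROUP_B:
        raise ValueError(f"unknown button: {button}")
    own_group = GROUP_A if button in GROUP_A else GROUP_B
    # Selecting a button drops any selection that belongs to the other group.
    kept = selected & own_group
    # Pressing an already selected button switches it off again.
    return kept - {button} if button in kept else kept | {button}


state: set[str] = set()
state = toggle(state, "extractions")
state = toggle(state, "annotations")     # both buttons of the first group are on
state = toggle(state, "false_positive")  # the first group is cleared
print(sorted(state))
```

After the three presses, only the false positive button remains selected, because selecting it deselected both buttons of the other group.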
Extraction and annotation buttons are dropdown menus. Expand them to filter the highlights based on a specific class.
When a class is selected, the dropdown arrow changes its color.
Note
Use the search bar to look for classes. Select the X icon to reset the search criteria.
To see all highlighted classes again, expand the menu and deselect the class.
Warning
You cannot filter the highlights based on more than one class belonging to a single group.
The Classes tab
In list view
In list view, the Classes tab is in the left panel and lists all project classes.
Classes that have not been annotated or—if an experiment is selected—extracted by the experiment model, are grayed out.
The two numbers next to each class name are, respectively, the number of distinct values that have been extracted by the experiment model and the number of distinct values that have been annotated.
- To filter the list of classes, type the name of a class or part of it in the Filter classes box and press Enter. The match is case insensitive. Select the X icon inside the box to cancel the filter.
- To switch to the Resources tab of the project dashboard and show the detail of a class, hover over the class and select Show in resources.
- To show more information about a class, hover over it and select Show info.
- To switch to the context view, hover over the class and select Context view.
- To expand and collapse your groups, select Expand all groups / Collapse all groups.
- Select the expanding arrow and the collapsing arrow to expand and collapse a specific group.
In some cases, some classes—grouped or ungrouped—may have neither annotations nor extractions.
As mentioned above, these classes are grayed out. Groups made entirely of empty classes are collapsed by default. If a group has at least one class with annotations, the group is expanded by default.
- To hide these classes, select Hide empty classes and groups. All empty classes, and groups made entirely of empty classes, will disappear.
Note
If a group contains at least one class with annotations, the group itself is not hidden, only its empty classes.
- To show empty classes again, select Show empty classes and groups.
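As a rough illustration of the hiding rule, the sketch below drops empty classes and then drops any group left without classes. It covers grouped classes only, and the `ClassEntry` record, its field names, and `hide_empty` are assumptions made for the example, not Platform identifiers.

```python
from dataclasses import dataclass


@dataclass
class ClassEntry:
    # Hypothetical record; field names are assumptions, not Platform identifiers.
    name: str
    annotations: int = 0
    extractions: int = 0

    @property
    def is_empty(self) -> bool:
        # A class is empty when it has neither annotations nor extractions.
        return self.annotations == 0 and self.extractions == 0


def hide_empty(groups: dict[str, list[ClassEntry]]) -> dict[str, list[ClassEntry]]:
    """Drop empty classes, then drop groups left with no classes."""
    visible = {}
    for group, classes in groups.items():
        kept = [c for c in classes if not c.is_empty]
        if kept:
            visible[group] = kept
    return visible


groups = {
    "Person": [ClassEntry("name", annotations=3), ClassEntry("nickname")],
    "Address": [ClassEntry("street"), ClassEntry("city")],
}
result = hide_empty(groups)
print({g: [c.name for c in cs] for g, cs in result.items()})
```

In this example the Address group disappears entirely, while the Person group stays and only loses its empty class.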
To expand the lists of extracted and annotated values for a class, select it or select the chevron icon to the right of the class. To collapse the lists, select Go back .
When you expand a class, you see two lists: Extractions and Annotations. If an experiment is selected, under Extractions you'll find the distinct values that were detected by the experiment model.
Under Annotations are listed the distinct values that were annotated as expected results.
The numbers beside the Extractions and Annotations headings are, respectively, the number of distinct values detected by the experiment model and the number of distinct values that have been annotated in the library.
The number beside each value under Extractions is the number of documents in the current list from which that value was extracted by the experiment model. Similarly, the number beside each value under Annotations is the number of documents in which that value was annotated.
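To make the two kinds of numbers concrete, here is a minimal sketch of how such counts could be derived for one class. The sample data and variable names are invented for illustration, and the sketch assumes that a value repeated inside one document counts once for that document.

```python
from collections import Counter

# Hypothetical data: for each document, the values extracted for one class.
extracted_by_document = {
    "doc1": ["Rome", "Paris", "Rome"],
    "doc2": ["Rome"],
    "doc3": ["Berlin"],
}

# Count each value once per document, regardless of repetitions inside it.
documents_per_value = Counter(
    value for values in extracted_by_document.values() for value in set(values)
)

# Number of distinct values, as shown beside the Extractions heading.
distinct_values = len(documents_per_value)

print(distinct_values)
print(sorted(documents_per_value.items()))
```

Here the heading would show 3 distinct values, and Rome would show 2 because it was extracted from two documents. The same reasoning applies to the Annotations list.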
Note
In case of a project with no annotations and no experiments, no information is displayed in the tab.
- To filter the lists, type a value or the initial part of it in the Filter list box and press Enter. The match is case sensitive. Select the X icon inside the box to cancel the filter.
- To change the sort order, select the desired option from the dropdown menu at the top right of the list.
- Double-click an item to insert it in the search bar as a criterion for document search.
If a list appears truncated, select Open beside its name to give the list maximum space. Select Close to return to the previous view.
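The Filter classes and Filter list boxes match text differently. The sketch below shows one reading of the descriptions above: a case-insensitive match on any part of a class name, and a case-sensitive match on the initial part of a value. The function names are hypothetical and the actual matching logic of the Platform may differ.

```python
def filter_classes(names: list[str], query: str) -> list[str]:
    """Case-insensitive match on any part of the class name."""
    needle = query.casefold()
    return [n for n in names if needle in n.casefold()]


def filter_values(values: list[str], query: str) -> list[str]:
    """Case-sensitive match on the initial part of the value."""
    return [v for v in values if v.startswith(query)]


print(filter_classes(["Invoice number", "Due date", "VAT NUMBER"], "number"))
print(filter_values(["Rome", "rome", "Romania"], "Rom"))
```

The first call keeps both class names containing "number" in any letter case. The second keeps only "Rome" and "Romania", because "rome" differs in case from the typed text.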
In detail view
The Resources tab on the right of the detail view shows the classes and the related annotations and extractions.
In a new project with no annotations and no experiments, no details are available. If the project has annotations and multiple experiments, details about extracted classes are shown for the latest experiment.
Select the dropdown menu, set to All classes by default, to expand a class. When you expand a class, you see two lists: Extractions and Annotations. The information shown and the actions available are the same as those described for the list view when a class is expanded.
To filter the lists based on a specific value, type a value or the initial part of it in the Filter list box and press Enter. The match is case sensitive. Select the X icon inside the box to cancel the filter.
The Resources tab has some other useful tools that are used to annotate.
If there are extraction aggregations, two more sub-tabs are added:
- Classes
- Groups
The Classes sub-tab shows the default view with all classes including those belonging to groups.
The Groups sub-tab shows all project groups with their classes.
Select the dropdown menu, set to All groups by default, to filter the group list.
Annotation controls
In extraction projects, both the list view and detail view of the Documents tab have numerous controls for annotating classes and sections.
Segments
Segments are blocks of text identified by specific rules in Studio projects.
Segments are part of the output of experiments based on CPK models imported from Studio. When detected, they are listed under the Tokens tab, which is inside the left panel in list view and inside the right panel in detail view.
Sections
In the detail view and its variants you can toggle the display of sections. To do this, use the Toggle sections toggle switch on the toolbar next to the document name. The button is enabled only if sections have been defined.
In the absence of explicit annotations, all text belongs to the standard section.
In the dedicated article you will find information on how to annotate sections.
Extraction aggregations
The aggregation of extraction results is available in case of:
- An extraction project based on an imported CPK.
- A group with at least two extracted classes in a document.
This feature is available only in immersive view and aggregates extraction experiment results.
To turn on the aggregation while in immersive view, select Toggle aggregations on the toolbar. To turn it off, select Toggle aggregations again.
This is what you see when the button above is not selected.
This is what you see when the button above is selected.
As you can see, only true and false positives are visible when the aggregation is activated.
Use the Filter list bar to look for classes.
Note
The toolbar beside the Filter list bar is already described in the annotation section.
Scores
To toggle the display of the confidence score for extractions, select the ellipsis from the toolbar above the document text, then select Show value scores.
The score will appear beside the extracted values.
For false negative values, the score will be 0.
The score is also visible:
- In the immersive view and with aggregated extractions.
- In the annotation pop-up toolbar.
Open a document in Studio
You can copy a document and its annotations to Studio and have it automatically prepared and analyzed there.
Requirements:
- Studio version 4.0.0 or later.
- The Platform project must have been previously linked to the Studio project.
To trigger the action, when in detail view, select Open document in Studio.
If Studio is not running, it starts automatically and opens the latest project you worked on. If Studio is not connected to the Platform but has a profile for that instance, it connects to it automatically.
Debug information
If the current experiment returned results for the document, in detail view you can display debug information that helps explain each result.
To see debug information about the output extractions, select Toggle debug extractions. The Debug panel will appear on the right.
The amount and the type of information depend on the model type that was chosen for the experiment, as exemplified in the figures below.
- Explainable Extraction and Studio
- Other ML models
The information common to all model types is:
- Class name
- Group name, in brackets, when available
- Extraction
- Position of the text corresponding to the extraction
- Extraction score
For CRF models, the name, the value and the number of occurrences of the text features that determined the prediction are also shown.
To highlight the portion of text a feature was extracted from, just select the feature in the list. All the occurrences of the feature will be highlighted in the document's text.
For Explainable Extraction and Studio models, the rules that were triggered causing the output of the extraction are listed.
To highlight the hits of a rule, select the rule from the list. For each hit, the rule scope (for example, the sentence) is highlighted.
To select multiple rules, use Ctrl+Click.
When highlighting hits of a rule or the text features that determined the prediction, a highlight counter is displayed beside the panel name. Select the X to remove all highlights.