Categorization
The Categorization tool window shows the categorization results of the last analysis carried out on the test document specified in the context bar.
Its panels are:
- Categories
- Hits/Blocks
- Rule preview
Categories panel
This panel shows the categories returned by the categorization process and, as a detail for each category, the rules (in the form of snippets) that determined that outcome.
There's a row for each output category.
The W symbol marks "winners" and the L symbol marks "losers".
"Losing" categories are those that were ruled out of the final output through scripting, but can still be displayed in Studio.
The columns are:
Column | Description |
---|---|
Category/Rules | Category name or rule's condition |
Label | (Only for categories) Category description |
Score | The overall score of the category or, for rules' snippets, the contribution of the rule to the score |
Frequency | (Only for categories) The score computed as a percentage of the sum of the scores of all the categories |
Hits | (Only for rules) The number of times the rule was matched |
Compound | (Only for categories) Compound score |
Quality | (Only for categories) Quality check result |
The quality check result appears only if the category has been set as a categorization target for the test document through annotation.
The Compound column appears only if the Tool Windows > Categorization > Show Compound column in categories tree configuration property is set to true in the Studio Settings window. For a description of the compound score, read about the CHILD_TO_FATHER option in the Studio languages reference manual.
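As a rough illustration of how the Frequency column relates to the Score column, the following Python sketch computes each category's score as a percentage of the sum of all category scores. The function name and the sample categories and scores are hypothetical, and Studio may round or display the values differently.

```python
def category_frequencies(scores: dict[str, float]) -> dict[str, float]:
    """Return each category's score as a percentage of the total score.

    `scores` maps a (hypothetical) category name to its overall score.
    """
    total = sum(scores.values())
    if total == 0:
        # Avoid division by zero when no category has a score.
        return {name: 0.0 for name in scores}
    return {name: 100.0 * score / total for name, score in scores.items()}


# Hypothetical scores for two output categories.
scores = {"sports": 30.0, "finance": 10.0}
frequencies = category_frequencies(scores)
print(frequencies)
```

With these sample values the two categories account for three quarters and one quarter of the total score respectively.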
These are the icons used to describe quality check results:
Name | Description |
---|---|
TP | True positive, the annotated target was matched |
FP | False positive, unexpected result, no annotation for this result |
FN | False negative, there were no results for the annotated target |
Quality undefined | Incomplete match between actual extractions and targets: some values match annotations, other values don't |
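The TP, FP and FN results listed above follow the usual comparison between actual output and annotated targets. The Python sketch below shows that comparison for a single test document. It is only an illustration under simplifying assumptions: the function name and the sample category names are hypothetical, it does not model the "Quality undefined" case, and it is not a description of how Studio computes the result internally.

```python
def quality_check(output_categories: set[str],
                  annotated_targets: set[str]) -> dict[str, str]:
    """Label each category as TP, FP or FN for one test document."""
    results = {}
    for name in output_categories | annotated_targets:
        if name in output_categories and name in annotated_targets:
            results[name] = "TP"  # annotated target was matched
        elif name in output_categories:
            results[name] = "FP"  # result with no annotation
        else:
            results[name] = "FN"  # annotated target with no result
    return results


# Hypothetical output categories and annotations.
labels = quality_check({"sports", "finance"}, {"finance", "politics"})
for name in sorted(labels):
    print(name, labels[name])
```

In this example "finance" is both returned and annotated, "sports" is returned without an annotation, and "politics" is annotated but not returned.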
The toolbar contains:
Name | Description |
---|---|
Expand All | Expand all the categories, showing rules |
Collapse All | Collapse all the categories, hiding rules |
Search box | Search criteria |
Reset | Reset the search criteria |
Find Next | Display the results matching the search criteria in the Find Results tool window |
Toggle Label Visibility | Display or hide the Label column |
Toggle D Hit Categories Visibility | Display or hide losing categories |
Toggle Annotations Visibility | Display or hide the annotation toolbar |
The counter at the bottom of the panel shows the number of categories.
Context menu commands are:
Command | Description |
---|---|
Copy Category Name | Copy the category name |
Copy Category Label | Copy the category label |
Copy Category Frequency | Copy the category frequency |
View in Taxonomy | Open the Classes tool window, Taxonomy panel, and set its focus to the selected category |
Available mouse commands are:
Command | Description |
---|---|
Click a column header except Quality | Change sort order |
Click a category | Highlight the hits of all the category's rules in the test document inside the editor |
Click a rule's snippet | Highlight the rule's hits in the test document inside the editor and update the other panels |
Double-click a rule's snippet | Same as a single click; repeating the command toggles between highlighting the rule in the rules file and highlighting the rule's hits in the test document, both inside the editor |
Annotation toolbar
This toolbar allows annotating results as categorization targets for future quality checks.
Name | Description |
---|---|
Annotate box | Name of the category to annotate. The box is filled automatically with the name of the selected output category, but the value can also be set by hand. |
Annotate | Annotate the specified category |
Annotate All | Annotate all the output categories as target results |
Hits/Blocks panel
This panel shows all the hits of the rule that's selected in the categories panel. For every hit, it can show the blocks, that is, the rule's sub-conditions that were satisfied and thus determined the hit.
There's a row for each hit.
Score is the contribution the hit gave to the overall category score.
The toolbar contains:
Name | Description |
---|---|
Expand All | Expand all the hits, showing blocks |
Collapse All | Collapse all the hits, hiding blocks |
The counter at the bottom of the panel shows the number of hits for the selected rule.
Available mouse commands are:
Command | Description |
---|---|
Click a hit | Highlight the hit in the test document inside the editor |
Click a block | Highlight the text that was matched by the block in the test document inside the editor |
Double-click a hit | Repeating the command toggles between highlighting the rule in the rules source file and highlighting the rule's hits in the test document, both inside the editor |
Rule preview panel
This panel shows the rule that's selected in the categories panel, without the need to open the rule's source file in the editor.