Work with categorization rules
Steps
1. Create a categorization rule
Open the rule file in the Project tool window.
In the Classes tool window, Taxonomy tab, right-click the domain for which you want to write a rule and select Create rule.
A rule skeleton that already references the domain you chose is generated.
2. Use the KEYWORD attribute in your categorization rule
Use the shortcut Ctrl+Shift+K to automatically set the structure of the KEYWORD attribute. Then write your keyword(s) between the quotation marks. Remember that a keyword is a specific string of characters: it is case insensitive when written entirely in lowercase; otherwise it is case sensitive.
Move to the upper-right corner of the GUI and select Build or press F6 to compile the project.
Pick a test file and open it in the editor, then go back to the upper-right corner of the GUI and select Analyze Document or press F5.
Select the Categorization tool window at the bottom to check if the rule has triggered. Expand the category in the results and select the rule's snippet to highlight the rule hits.
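The case rule for keywords described in this step can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not the way the tool is implemented: the function name `keyword_matches` is hypothetical, and plain substring matching is a simplifying assumption.

```python
def keyword_matches(keyword: str, text: str) -> bool:
    """Return True if `keyword` occurs in `text` under the case rule above.

    Hypothetical helper for illustration; substring matching is an assumption.
    """
    if keyword == keyword.lower():
        # Keyword written entirely in lowercase: match case-insensitively.
        return keyword in text.lower()
    # Keyword contains uppercase characters: the case must match exactly.
    return keyword in text


print(keyword_matches("apple", "Apple pie"))  # lowercase keyword, any case in text
print(keyword_matches("Apple", "apple pie"))  # mixed-case keyword, exact case required
```

Under these assumptions the first call matches and the second does not, which is why a lowercase keyword is usually the broader choice.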
3. Use the LEMMA attribute in your categorization rule
Use the shortcut Ctrl+Shift+L to automatically set the structure of the LEMMA attribute. You can also write more than one lemma in the same rule by separating them with a comma (,).
In the upper-right corner of the GUI, select Build or press F6 to compile the project.
Pick a test file and open it in the editor, then go back to the upper-right corner of the GUI and select Analyze Document or press F5.
Select the Categorization tool window at the bottom to check if the rule has triggered. Select the category to highlight the hits of all your rules referring to that category.
Alternatively, expand the category and select the snippet of the rule you need.
Note that the LEMMA attribute also detects inflected forms of the words.
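To make the difference from KEYWORD concrete, here is a minimal Python sketch of lemma-based matching. The `LEMMAS` table and the `lemma_hits` function are hypothetical stand-ins: a real analyzer derives lemmas linguistically, while this sketch assumes a tiny hand-written lookup.

```python
# Hypothetical lookup from inflected form to lemma (illustration only).
LEMMAS = {
    "run": "run", "runs": "run", "ran": "run", "running": "run",
    "dog": "dog", "dogs": "dog",
}


def lemma_hits(rule_lemmas, text):
    """Return the words in `text` whose lemma is listed in `rule_lemmas`."""
    wanted = set(rule_lemmas)
    hits = []
    for word in text.lower().split():
        word = word.strip(".,;:!?")  # drop surrounding punctuation
        # Words missing from the table are treated as their own lemma.
        if LEMMAS.get(word, word) in wanted:
            hits.append(word)
    return hits


# Two lemmas in one rule, as with a comma-separated LEMMA list.
print(lemma_hits(["run", "dog"], "The dogs ran while she was running."))
# → ['dogs', 'ran', 'running']
```

A keyword for "run" would miss "ran" and "running"; a lemma catches all three forms.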
4. Apply your categorization rules to test documents and see results
In the upper-right corner, select Analyze All Documents and choose a name for the analysis report.
Open the Report tool window to find your analysis report. Double-click it to see the list of analyzed documents with information about the results.
Open a document in the editor by double-clicking it and check the categorization results in the Categorization tool window at the bottom.
Tips & tricks
Choose the proper score for your rule
Open a file containing categorization rules; change the score inside brackets by replacing NORMAL (10 points) with LOW (3 points) or HIGH (15 points).
Select Build or press F6, open a text file in the editor and select Analyze Document at the top-right of the GUI or press F5. Select the Categorization tool window at the bottom and compare the results obtained with the three different scores.
Consider how the category score changes according to the different scores in the rules.
Change the score to HIGH when the rule is particularly relevant for your category. Choose LOW when the rule is relevant, yet not as relevant as the "normal" ones, so it should reach a considerable score only if it triggers more than once.
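The effect of the three scores can be worked through with a short Python sketch. It uses the point values given above and assumes, as a simplification, that a category's score is the sum of the points of every rule hit; the actual scoring may take other factors into account. The names `SCORE_POINTS` and `category_score` are hypothetical.

```python
# Point values as stated above.
SCORE_POINTS = {"LOW": 3, "NORMAL": 10, "HIGH": 15}


def category_score(hit_scores):
    """Sum the points of each rule hit (assumed additive scoring)."""
    return sum(SCORE_POINTS[label] for label in hit_scores)


print(category_score(["NORMAL"]))   # one NORMAL hit → 10
print(category_score(["LOW"]))      # one LOW hit → 3
print(category_score(["LOW"] * 4))  # four LOW hits → 12
print(category_score(["HIGH"]))     # one HIGH hit → 15
```

Under this assumption a LOW rule needs four hits to overtake a single NORMAL hit, which matches the advice to use LOW for rules that should count only when they trigger repeatedly.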