The text intelligence engines you develop with Studio may have to process structured documents divided into sections. For example, emails can have these sections:
- "To" recipients
- "CC" recipients
Test files in Studio contain plain text simulating the text extracted from the documents that the engine will have to analyze.
In general, this text contains no information about where sections begin and end in the original document. However, if each test file is accompanied by a special file that records the position of the sections in the text, Studio takes it into account when you use the Analyze command, and the rules that refer to sections behave as expected.
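To make the idea of section positions concrete, here is a minimal Python sketch. It is not Studio's actual file format, which this page does not describe. It assumes, purely for illustration, that each section is recorded as a name plus a pair of character offsets into the test file text. The names `SectionSpan` and `section_at` and the section names `TO` and `CC` are hypothetical.

```python
from typing import List, NamedTuple


class SectionSpan(NamedTuple):
    """Hypothetical record: a section name and its character range in the text."""
    name: str
    start: int  # offset of the first character (inclusive)
    end: int    # offset just past the last character (exclusive)


def section_at(offset: int, spans: List[SectionSpan]) -> str:
    """Return the section that covers the given offset.

    Text that no recorded span covers is treated as BODY, mirroring the
    idea that a default section applies wherever nothing else is marked.
    """
    for span in spans:
        if span.start <= offset < span.end:
            return span.name
    return "BODY"


# Plain text simulating what was extracted from an email.
text = "alice@example.com\nbob@example.com\nHello, see attached."

# Assumed positions: the first line is the "To" part, the second the "CC" part.
spans = [
    SectionSpan("TO", 0, 17),
    SectionSpan("CC", 18, 33),
]
```

With these spans, `section_at(20, spans)` falls inside the second line and `section_at(40, spans)` falls in the unmarked remainder of the text.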
One way to create the files containing the position of the sections is manual annotation. Do the following:
- Declare the sections you expect using the SECTIONS statement of the Rules language. The default section is BODY, which, by definition, cannot be annotated (all text is BODY unless specified otherwise), so you can annotate only the other sections.
- Open the test file you want to annotate in the editor.
- Select the section text.
- Right-click on the selection and choose Annotate Section > SECTION.
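The logic behind these steps can be sketched in Python. This is an illustration under stated assumptions, not a description of how Studio stores annotations: the function `annotate`, its parameters, and the `(section, start, end)` tuple layout are all hypothetical. The sketch only mirrors the two rules given above: a section must be declared first, and BODY cannot be annotated.

```python
from typing import Set, Tuple


def annotate(text: str, selection: str, section: str,
             declared: Set[str]) -> Tuple[str, int, int]:
    """Turn a text selection into a hypothetical (section, start, end) record."""
    if section == "BODY":
        # BODY is the default for all text, so it is never annotated explicitly.
        raise ValueError("BODY is the default section and cannot be annotated")
    if section not in declared:
        raise ValueError(f"section {section!r} has not been declared")
    start = text.find(selection)  # first occurrence of the selected text
    if start == -1:
        raise ValueError("selection not found in the test file text")
    return (section, start, start + len(selection))


# Section names assumed to have been declared beforehand (hypothetical names).
declared = {"TO", "CC"}
text = "alice@example.com\nbob@example.com\nHello, see attached."

record = annotate(text, "bob@example.com", "CC", declared)
```

Here `record` holds the section name and the character range of the selected text. Trying to annotate a selection as `BODY` raises an error, which reflects the rule that only the other sections can be annotated.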