Templates and fields
Templates and fields make up the underlying structure of the extraction process.
The syntax to define a template is:
TEMPLATE(templateName)
{
@fieldName_1,
@fieldName_2,
...
@fieldName_n
}
where:
- TEMPLATE is a language keyword and must be written in uppercase.
- templateName and fieldName_# can be any sequence of alphanumerical characters and underscores (_); accented letters and other punctuation marks are not allowed.
The template name must be unique within the project. All field names must be preceded by the at sign (@).
Warning
- You can't use language keywords to identify templates and fields.
- Template and field names must not start with an underscore (_) or with a number, otherwise an error will occur.
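The naming rules above can be checked mechanically. The following Python sketch is not part of the product; it is only one way to express the rules as a regular expression. The helper name `is_valid_name` and the `KEYWORDS` set are hypothetical, and the set is deliberately incomplete because TEMPLATE is the only keyword named in this section. It also assumes that keywords are matched case-sensitively.

```python
import re

# An ASCII letter first, then any mix of ASCII letters, digits and underscores.
# This excludes accented letters, punctuation, and names starting with "_" or a digit.
_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

# Hypothetical and incomplete: only TEMPLATE is confirmed as a keyword by the text above.
KEYWORDS = {"TEMPLATE"}


def is_valid_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rules for templates and fields.

    Pass field names without the leading at sign (@).
    """
    return _NAME_RE.fullmatch(name) is not None and name not in KEYWORDS


if __name__ == "__main__":
    for candidate in ["PERSONAL_DATA", "fieldName_1", "_Name", "1stField", "TEMPLATE"]:
        print(candidate, is_valid_name(candidate))
```

A check like this could run before a template definition is written to the project, so that naming mistakes are caught before the error mentioned in the warning occurs.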
Each template must contain at least one field. If a template contains several fields, they must be separated by commas. Two fields in the same template cannot share a name, but fields in different templates can. Templates must be defined in the special Config.cr source file.
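As an illustration of these structural rules, the Python sketch below checks a set of template definitions held in memory. It is an assumption-laden model, not the product's own validation: the function name `check_templates`, the representation of a template as a `(name, fields)` pair, and the message wording are all hypothetical.

```python
from collections import Counter


def check_templates(templates):
    """Check a list of (template_name, field_names) pairs against the rules above.

    Returns a list of human-readable problems; an empty list means no rule is violated.
    """
    problems = []

    # Template names must be unique within the project.
    name_counts = Counter(name for name, _ in templates)
    for name, count in name_counts.items():
        if count > 1:
            problems.append(f"template name {name!r} is defined {count} times")

    for name, fields in templates:
        # Each template must contain at least one field.
        if not fields:
            problems.append(f"template {name!r} has no fields")
        # Field names must be unique within the same template.
        for field, count in Counter(fields).items():
            if count > 1:
                problems.append(f"template {name!r} repeats field {field!r}")

    return problems


if __name__ == "__main__":
    project = [
        ("PERSONAL_DATA", ["Name", "Telephone", "Address"]),
        # COMPANY is a hypothetical second template; reusing "Name" here is allowed.
        ("COMPANY", ["Name", "Address"]),
    ]
    print(check_templates(project))
```

The same field name appearing in two different templates produces no problem, while a repeated field inside one template or a repeated template name does.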
A template can be compared to a table with columns and rows, where fields correspond to the column headers.
Templates are data receptors, meaning that their fields get filled with extracted data.
Every extraction project must have dedicated templates that reflect its requirements.
The table below represents a sample template filled with data.
| Name | Telephone | Address |
|---|---|---|
| Jane Doe | 555-0199 | 123 Blue Street, New York |
| John Smith | 020 7946 0123 | 456 Park Lane, London |
Its definition is:
TEMPLATE(PERSONAL_DATA)
{
@Name,
@Telephone,
@Address
}
This template is named PERSONAL_DATA and it contains three fields, @Name, @Telephone and @Address. It is like defining a table named PERSONAL_DATA with three columns named @Name, @Telephone and @Address.
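To make the table analogy concrete, here is a minimal Python sketch that models the PERSONAL_DATA template as a fixed set of column names and each filled record as a row. This is only an analogy, not how the extraction engine stores data. The helper `make_record` is hypothetical, and representing an unfilled field as `None` is an assumption.

```python
TEMPLATE_NAME = "PERSONAL_DATA"
FIELDS = ("Name", "Telephone", "Address")  # the column headers


def make_record(**values):
    """Build one row of the table, keyed by field name.

    Fields that receive no value are set to None (an assumption for this sketch).
    Values for fields that the template does not define are rejected.
    """
    unknown = set(values) - set(FIELDS)
    if unknown:
        raise KeyError(f"{TEMPLATE_NAME} has no field(s): {sorted(unknown)}")
    return {field: values.get(field) for field in FIELDS}


# The two rows of the sample table.
rows = [
    make_record(Name="Jane Doe", Telephone="555-0199",
                Address="123 Blue Street, New York"),
    make_record(Name="John Smith", Telephone="020 7946 0123",
                Address="456 Park Lane, London"),
]

if __name__ == "__main__":
    for row in rows:
        print(row)
```

Every row has the same keys in the same order as the template's fields, which is what makes the template behave like a table header.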
Beyond the basic syntax, advanced options can be used to control and aggregate extracted data from various fields. Some apply to single fields, while others affect the behavior of the entire template. By default, extraction generates isolated values; advanced options can aggregate them into complex records.
Advanced options are:
- Field attributes: used to characterize some or all of the fields in a template in a special way.
- Merge options: used to define whether and when a template can aggregate all the extracted values and generate compound records.
It is possible to use one or more of these options at the same time in the same template.