Model blocks: types and properties

Model types

These types of model can be used as blocks when designing a workflow:

Type	Block icon	Description
ML models		Models based on hybrid technology, combining a symbolic and an ML engine
Symbolic models		Purely symbolic models
Knowledge Models		Built-in inventory models
Custom components		Installation or customer specific components

ML and symbolic models are created with the authoring application and made available to the Workflow application through publication or exported from the authoring application and then imported into the Workflow application. Available models are listed in the Components panel of the workflow editor or in the Models view of the main dashboard.

Block properties

General properties

The general properties of any model block are:

The unique block ID.
Block name.
Description.

Only the block name can be changed.

Type specific properties

The type specific properties of a model block can be found in the Type Specific tab of the block properties window.

For Symbolic or Knowledge Models, the only property is Timeout, which is the execution timeout.

For ML models, the properties are:

Symbolic Engine Timeout
ML Engine Timeout

These are the execution timeouts of the two model components, symbolic and ML.

The timeout can be expressed in minutes (m) or seconds (s).

Functional options

Functional options can be found in the Functional tab of the block properties window and they vary according to the model type.

Note

In the following description, admitted values are indicated in round brackets next to each parameter name, with the possible default value in squares.

ML models

ML model blocks have the following functional parameters:

ML Engine: propagate document content to output ([on]/off): make the model return the content output key.
ML Engine: output namespace (string): output keys for the ML step of the model may have a namespace property. This option allows overriding the default value of that property.
ML Engine: Output winner categories only (on/[off]): output only categories with property winner set to true.
Propagate symbolic engine output (on/[off]): when this option is turned on, the output of the symbolic step is included in the overall block output.
All these options:
- Apply rules
- Synchronize positions to original text
- Rules output namespace
- Output relevants
- Output sentiment
- Output relations
- Output dependency tree
- Output knowledge
- Output external ids
- Output rules extra data
- Output segments
affect to the symbolic step of the model and are the same of a purely symbolic model. They have an effect on the overall output of the model based on the Propagate symbolic engine output option (see above).

Symbolic models

The symbolic model type has the following properties:

Synchronize positions to original text (on/[off]): model output can contain the positions—start, end—of elements in text, for example the position of parts of the text that triggered a category or the parts the correspond to extractions.
The input text to the model can be slightly changed—optimized—by the model itself before being analyzed: for example, sequences of new line characters can be reduced to one new line character or multiple consecutive space characters collapsed to one space character. By default, positions refer to the analyzed text, which can thus slightly differ from the original. Analyzed text is returned in output, so if positions are used to highlight parts of it, they are always accurate. This option must be turned on only if the user needs positions that refer to the original input text. When the option is turned on, the model rebases positions so that they are accurate for the original text, which means they could not be accurate for the analyzed text.
Apply rules ([on]/off): directs the model to apply categorization and extraction rules, which is the default behavior.
Rules output namespace (string): some of the ouptput keys for a symbolic model have a namespace property. This option allows overriding the default value of that property.
Output relevants ([on]/off): make the model return the most important elements of the input text, that is the topics, mainSentences, mainPhrases, mainSyncons and mainLemmas output keys.
Output sentiment (on/[off]): make the model return the sentiment output key.
Output relations (on/[off]): make the model return the relations output key.
Output dependency tree (on/[off]): make the model return the pos, dependency and morphology properties for each item of the tokens output array.
Output knowledge (on/[off]): make the model return the knowledge output key.
Output external ids (on/[off]): inside the Knowledge Graph used by the model for the semantic analysis of the input text, concepts—called syncons—are identified by a unique number. This number is the value of the syncon property of the items of the tokens output key.
Syncons have further identification numbers, so-called external (one or more) that are not shown by default in the model output.
This option determines the addition to the output, for each element of the knowledge key, of the externaIds listing those external identifiers.
Activating the option is effective only if the Output knowledge option is also activated (see above).
Output rules extra data (on/[off]): make the model return the extraData output key. Note that not all symbolic models generate extra data, so the key can still be missing from the output even if this option is turned on.
Output segments (on/[off]): make the model return the segments output key.

Knowledge Models

Knowledge Models have the same functional properties of purely symbolic models.

Deployment properties

The deployment properties of a model block can be found in the Deployment tab of the block properties window.

They express the resources required by the model when the workflow runs in Platform runtime. In order for a workflow to be published, the runtime must have a free amount of resources at least equal to the sum of those required by all the models in the workflow.

For symbolic models and Knowledge Models the deployment parameters are:

Replicas: number of required instances (3 maximum)
Memory: required memory
CPU: thousandths of a CPU required (for example: 2000 = 2 CPUs)

For ML models they are:

Symbolic Engine Replicas
Symbolic Engine Memory
Symbolic Engine CPU
ML Engine Replicas
ML Engine Memory
ML Engine CPU

IEC units of measures are used for memory: Ki (kibibyte), Mi (mebibyte), Gi (gibibyte).

Input properties

The need for mapping

The properties in the Input tab are used to map the output of the previous block to the input of the current block, therefore they determine the input to the model.
This mapping is not needed if the block is the first of a workflow, because in this case the block accepts as input the one submitted to the entire workflow through the API.

If there's no such mapping, a notification is displayed in the editor and the model block is surrounded by a dashed line.

This mapping is not needed if the block is the first of a workflow, because in this case the block accepts as its input whatever gets submitted to the entire workflow through the API.

text

The text property corresponds to the text that will be analyzed by the model and is a string. It is typically mapped to:

The modelName.document.content property of a preceding model's output.
The content property of Tika Converter processor output (Tika Converter.content)
The content property of URL Converter processor output (URL Converter.content)

documentLayout

The documentLayout property is an object that indiriectly corresponds to text that will be analyzed by the model: the model itself produces the plain text to analyze by parsing the documentLayout object, taking the text values of some of its memmbers and concatenating them.
The object must have the same structure of the result object of the output of the Extract Converter processor, so the typical mapping is to property (Extract Converter.result) when the previous block in the workflow is such processor.

sections

The sections property is complementary to text and indicates the boundaries of sections of the text. It is an object like this:

"sections": [
    {
        "name": "TITLE",
        "start": 0,
        "end": 61
    },
    {
        "name": "BODY",
        "start": 62,
        "end": 2407
    }
]

where:

name is the name of the section.
start is the zero-based position of the first character in the section inside the value of text.
end is the zero-based position of the first character after the section inside the value of text.

It is used in conjunction with the text property which represents the text to be analyzed while sections contains the boundaries of the sections that the model can consider if it has been programmed to do so. Currently only symbolic models built with Studio can account for sections.
The expected mapping with the previous block is with the modelName.document.sections property of a model block which in turn received sections data from a previous block or from the request to the API.

sectionsText

The sectionsText property corresponds to text that will be analyzed by the model and is an object like this:

"sectionsText": [
    {
        "name": "TITLE",
        "text": "This is a title"
    },
    {
        "name": "BODY",
        "text": "This is the body"
    }
]

where:

name is the name of the section
text is the text of the section

Currently no standard block has its output or part of it mappable to this property, however it is possible to produce such an object using the Javascript operator.

The model builds plain text to analyze by concatenating the values of the text properties of the sectionsText array items using a newline character as a separator.
If the text property is also mapped, the text obtained from sectionsText is appended to the one represented by text using a newline character as a separator, so the model receives a text that is the result of the concatenation of two texts. The model also receives automatically computed section boundaries referred to the concatenated text.

For example:

Value of text:

We shall pay any price, bear any burden, meet any hardship, support any friend, oppose any foe to assure the survival and success of liberty.

Value of sectionsText:

[
    {
        "name": "TITLE",
        "text": "President John F. Kennedy delivered his inaugural address"
    }
]

Concatenated plain text:

We shall pay any price, bear any burden, meet any hardship, support any friend, oppose any foe to assure the survival and success of liberty.
President John F. Kennedy delivered his inaugural address

Sections boundaries:
- Section name: TITLE
- Start: 142
- End: 199

options

The options property is an object and corresponds to custom options that can be passed to the model to influence its behavior. Currently ony symbolic models produced with Studio are receptive to such options and can use them in their scripting.
No standard block has its output or part of it conceptually mappable to this property, however it is possible to produce such an object using the Javascript operator.

The options object must have a structure like this:

"options": {
    "custom": {
        ...
    }
}

The options to be passed to the model must be defined as properties of the nested object custom.
To access options with Studio scripting, use the getOptions method of the predefined CTX object. Specifically, scripting accesses options through the custom_options property of the object returned by getOptions().

For example, if the object passed to the model and mapped to its options property is:

"options": {
    "custom": {
        "doSomethingSpecial": true
    }
}

this code can access the doSomethingSpecial option and take it into account:

var options = CTX.getOptions();

if(
    options != null &&
    options.custom_options != null &&
    options.custom_options.doSomethingSpecial != null &&
    options.custom_options.doSomethingSpecial
) { // Do something special }

Output

The models output is described in the next pages.