Model blocks: types and properties
Model types
These types of model can be used as blocks when designing a workflow:
Type | Block icon | Description |
---|---|---|
ML models | Models based on hybrid technology, combining a symbolic and an ML engine | |
Symbolic models | Purely symbolic models | |
Knowledge Models | Built-in inventory models | |
Custom components | Installation or customer specific components |
ML and symbolic models are created with the authoring application and made available to the Workflow application through publication or exported from the authoring application and then imported into the Workflow application. Available models are listed in the Components panel of the workflow editor or in the Models view of the main dashboard.
Block properties
General properties
The general properties of any model block are:
- The unique block ID.
- Block name.
- Description.
Only the block name can be changed.
Type specific properties
The type specific properties of a model block can be found in the Type Specific tab of the block properties window.
For Symbolic or Knowledge Models, the only property is Timeout, which is the execution timeout.
For ML models, the properties are:
- Symbolic Engine Timeout
- ML Engine Timeout
These are the execution timeouts of the two model components, symbolic and ML.
The timeout can be expressed in minutes (m) or seconds (s).
Functional options
Functional options can be found in the Functional tab of the block properties window and they vary according to the model type.
Note
In the following description, admitted values are indicated in round brackets next to each parameter name, with the possible default value in squares.
ML models
ML model blocks have the following functional parameters:
- ML Engine: propagate document content to output ([on]/off): make the model return the
content
output key. - ML Engine: output namespace (string): output keys for the ML step of the model may have a
namespace
property. This option allows overriding the default value of that property. - ML Engine: Output winner categories only (on/[off]): output only categories with property
winner
set totrue
. - Propagate symbolic engine output (on/[off]): when this option is turned on, the output of the symbolic step is included in the overall block output.
-
All these options:
- Apply rules
- Synchronize positions to original text
- Rules output namespace
- Output relevants
- Output sentiment
- Output relations
- Output dependency tree
- Output knowledge
- Output external ids
- Output rules extra data
- Output segments
affect to the symbolic step of the model and are the same of a purely symbolic model. They have an effect on the overall output of the model based on the Propagate symbolic engine output option (see above).
Symbolic models
The symbolic model type has the following properties:
- Synchronize positions to original text (on/[off]): model output can contain the positions—start, end—of elements in text, for example the position of parts of the text that triggered a category or the parts the correspond to extractions.
The input text to the model can be slightly changed—optimized—by the model itself before being analyzed: for example, sequences of new line characters can be reduced to one new line character or multiple consecutive space characters collapsed to one space character. By default, positions refer to the analyzed text, which can thus slightly differ from the original. Analyzed text is returned in output, so if positions are used to highlight parts of it, they are always accurate. This option must be turned on only if the user needs positions that refer to the original input text. When the option is turned on, the model rebases positions so that they are accurate for the original text, which means they could not be accurate for the analyzed text. - Apply rules ([on]/off): directs the model to apply categorization and extraction rules, which is the default behavior.
- Rules output namespace (string): some of the ouptput keys for a symbolic model have a
namespace
property. This option allows overriding the default value of that property. - Output relevants ([on]/off): make the model return the most important elements of the input text, that is the
topics
,mainSentences
,mainPhrases
,mainSyncons
andmainLemmas
output keys. - Output sentiment (on/[off]): make the model return the
sentiment
output key. - Output relations (on/[off]): make the model return the
relations
output key. - Output dependency tree (on/[off]): make the model return the
pos
,dependency
andmorphology
properties for each item of thetokens
output array. - Output knowledge (on/[off]): make the model return the
knowledge
output key. - Output external ids (on/[off]): inside the Knowledge Graph used by the model for the semantic analysis of the input text, concepts—called syncons—are identified by a unique number. This number is the value of the
syncon
property of the items of thetokens
output key.
Syncons have further identification numbers, so-called external (one or more) that are not shown by default in the model output.
This option determines the addition to the output, for each element of theknowledge
key, of theexternaIds
listing those external identifiers.
Activating the option is effective only if the Output knowledge option is also activated (see above). - Output rules extra data (on/[off]): make the model return the
extraData
output key. Note that not all symbolic models generate extra data, so the key can still be missing from the output even if this option is turned on. - Output segments (on/[off]): make the model return the
segments
output key.
Knowledge Models
Knowledge Models have the same functional properties of purely symbolic models.
Deployment properties
The deployment properties of a model block can be found in the Deployment tab of the block properties window.
They express the resources required by the model when the workflow runs in Platform runtime. In order for a workflow to be published, the runtime must have a free amount of resources at least equal to the sum of those required by all the models in the workflow.
For symbolic models and Knowledge Models the deployment parameters are:
- Replicas: number of required instances (3 maximum)
- Memory: required memory
- CPU: thousandths of a CPU required (for example: 2000 = 2 CPUs)
For ML models they are:
- Symbolic Engine Replicas
- Symbolic Engine Memory
- Symbolic Engine CPU
- ML Engine Replicas
- ML Engine Memory
- ML Engine CPU
IEC units of measures are used for memory: Ki (kibibyte), Mi (mebibyte), Gi (gibibyte).
Input properties
The need for mapping
The properties in the Input tab are used to map the output of the previous block to the input of the current block, therefore they determine the input to the model.
This mapping is not needed if the block is the first of a workflow, because in this case the block accepts as input the one submitted to the entire workflow through the API.
If there's no such mapping, a notification is displayed in the editor and the model block is surrounded by a dashed line.
This mapping is not needed if the block is the first of a workflow, because in this case the block accepts as its input whatever gets submitted to the entire workflow through the API.
text
The text property corresponds to the text that will be analyzed by the model and is a string. It is typically mapped to:
- The
modelName.document.content
property of a preceding model's output. - The
content
property of Tika Converter processor output (Tika Converter.content
) - The
content
property of URL Converter processor output (URL Converter.content
)
documentLayout
The documentLayout property is an object that indiriectly corresponds to text that will be analyzed by the model: the model itself produces the plain text to analyze by parsing the documentLayout object, taking the text values of some of its memmbers and concatenating them.
The object must have the same structure of the result
object of the output of the Extract Converter processor, so the typical mapping is to property (Extract Converter.result
) when the previous block in the workflow is such processor.
sections
The sections property is complementary to text and indicates the boundaries of sections of the text. It is an object like this:
"sections": [
{
"name": "TITLE",
"start": 0,
"end": 61
},
{
"name": "BODY",
"start": 62,
"end": 2407
}
]
where:
name
is the name of the section.start
is the zero-based position of the first character in the section inside the value oftext
.end
is the zero-based position of the first character after the section inside the value oftext
.
It is used in conjunction with the text property which represents the text to be analyzed while sections contains the boundaries of the sections that the model can consider if it has been programmed to do so. Currently only symbolic models built with Studio can account for sections.
The expected mapping with the previous block is with the modelName.document.sections
property of a model block which in turn received sections data from a previous block or from the request to the API.
sectionsText
The sectionsText property corresponds to text that will be analyzed by the model and is an object like this:
"sectionsText": [
{
"name": "TITLE",
"text": "This is a title"
},
{
"name": "BODY",
"text": "This is the body"
}
]
where:
name
is the name of the sectiontext
is the text of the section
Currently no standard block has its output or part of it mappable to this property, however it is possible to produce such an object using the Javascript operator.
The model builds plain text to analyze by concatenating the values of the text
properties of the sectionsText
array items using a newline character as a separator.
If the text property is also mapped, the text obtained from sectionsText
is appended to the one represented by text using a newline character as a separator, so the model receives a text that is the result of the concatenation of two texts. The model also receives automatically computed section boundaries referred to the concatenated text.
For example:
-
Value of
text
:We shall pay any price, bear any burden, meet any hardship, support any friend, oppose any foe to assure the survival and success of liberty.
-
Value of
sectionsText
:[ { "name": "TITLE", "text": "President John F. Kennedy delivered his inaugural address" } ]
-
Concatenated plain text:
We shall pay any price, bear any burden, meet any hardship, support any friend, oppose any foe to assure the survival and success of liberty. President John F. Kennedy delivered his inaugural address
-
Sections boundaries:
- Section name:
TITLE
- Start: 142
- End: 199
- Section name:
options
The options property is an object and corresponds to custom options that can be passed to the model to influence its behavior. Currently ony symbolic models produced with Studio are receptive to such options and can use them in their scripting.
No standard block has its output or part of it conceptually mappable to this property, however it is possible to produce such an object using the Javascript operator.
The options object must have a structure like this:
"options": {
"custom": {
...
}
}
The options to be passed to the model must be defined as properties of the nested object custom
.
To access options with Studio scripting, use the getOptions method of the predefined CTX
object. Specifically, scripting accesses options through the custom_options
property of the object returned by getOptions()
.
For example, if the object passed to the model and mapped to its options property is:
"options": {
"custom": {
"doSomethingSpecial": true
}
}
this code can access the doSomethingSpecial
option and take it into account:
var options = CTX.getOptions();
if(
options != null &&
options.custom_options != null &&
options.custom_options.doSomethingSpecial != null &&
options.custom_options.doSomethingSpecial
) { // Do something special }
Output
The models output is described in the next pages.