Workflow and block input
Workflow input
The input submitted to a workflow must be a JSON object and is set by whoever uses the workflow, be it through its API or with the NL Flow test GUI.
The principle to follow when crafting that JSON is that it must contain all the client-side data the workflow needs to produce the expected result. This can include the document to analyze, as well as options and metadata.
The structure of the JSON must be compatible with the workflow, but it does not have to match exactly what the initial block of each possible flow expects. By defining the $nlflow_input pseudo-block, the designer can make the initial block of a flow accept a JSON that is apparently incompatible with it, and can also make all or part of the JSON reach blocks that are not the first in their flows.
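As an illustration only, a client could assemble a workflow input like the one below. The key names (text, options, metadata) and their values are hypothetical: the keys a real workflow needs depend on how it was designed.

```python
import json

# Hypothetical workflow input: every key name below is an assumption
# made for this example, not something a workflow necessarily expects.
workflow_input = {
    "text": "Acme Corp. acquired Widget Inc. in 2021.",  # the document to analyze
    "options": {"language": "en"},                        # client-side options
    "metadata": {"requestId": "example-001"},             # client-side metadata
}

# The workflow input must be a JSON object, so serialize the dictionary.
payload = json.dumps(workflow_input)
print(payload)
```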
Block input
Input variables and the input manifest
Overview
Input variables are data that a block expects at runtime.
For example:
- A model block expects a text to analyze.
- A TikaTesseract Converter block expects a file to extract text from.
- A Switch block expects something telling it which conditional downstream connection the flow must follow.
- A PDF Splitter block expects a PDF to split.
and so on.
Input variables have a name, with one possible exception when they are of type object, and a type that can be boolean, number, string, object or array.
Predefined, component-level variables
The following standard components have predefined input variables:
- Models
- Processors, excluding JavaScript Interpreter and Python Interpreter
- The Switch operator
- Splitters
- Reducers, excluding Simple Reducer
In the articles about the components in this section of the manual you will find the description of all component-level input variables.
Custom components can have input variables too, so for example the block of a custom component using a generative AI model expects a prompt to submit to the model.
When a component has input variables, every block of it inherits them.
For example, every TikaTesseract Converter block has a base64 input variable that is a string and must be set at runtime with the Base64 encoding of the bytes of the file from which the block must try to extract text.
Component-level input variables can be required, meaning they must be set at runtime, or optional.
For example, the base64 input variable of a TikaTesseract Converter block is required.
Optional variables can be implicitly required when:
- No input variable is formally required because there are alternative variables to choose from. For example, in a model block, one, and only one, of documentLayout and text must be set to provide the model with the text to analyze.
- One variable depends on another. For example, in a model block, sections depends on text.
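The two cases of implicitly required variables can be expressed as a small check. This is only a sketch of the constraints described above for a model block, not the validation the platform actually performs. The function name and the error messages are hypothetical.

```python
from typing import List


def check_model_input(input_json: dict) -> List[str]:
    """Return a list of problems; an empty list means the combination is valid."""
    problems = []
    has_text = "text" in input_json
    has_layout = "documentLayout" in input_json

    # Alternative variables: one, and only one, of the two must be set.
    if has_text == has_layout:
        problems.append("set exactly one of 'text' and 'documentLayout'")

    # Dependent variable: 'sections' depends on 'text'.
    if "sections" in input_json and not has_text:
        problems.append("'sections' requires 'text'")

    return problems


print(check_model_input({"text": "Some text to analyze"}))  # []
```

The values used for documentLayout and sections in any test of this function are placeholders, because only the presence of the keys is checked here.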
Input variables for a component are listed and described in a manifest that is visible inside the workflow editor, in particular:
- In the components guide, when hovering over a component in the inventory, under the Input tab.
- In the properties pop-up of a block under the Input tab.
User-defined, block level variables
There are components that don't have input variables, but their blocks must or can have them.
The Map operator, Simple Reducer and Simple Remapper, because of their purpose, cannot have predefined input variables: it is the designer who must define, at the block level, the input variables that are needed.
Other components of this kind are the script interpreters for JavaScript and Python. Their blocks receive the implicit input (version 1.0.0 of the JavaScript Interpreter allows only that), but if input mapping is required, the designer can define input variables at the block level.
Do all blocks need an input?
Not all blocks of a workflow need an input. For example, operators like Fork and Tunnel only affect the flow: they don't have an input and do not produce an output.
Who builds the input for the blocks and how?
At runtime, the input to a block of the workflow needing it is built by the workflow orchestration service, which is an essential part of every NL Flow runtime.
The workflow orchestration service builds the input JSON following these rules:
1. If the block has input properties, the input JSON is composed according to them. The JSON contains a top-level key for each input property that has been set, and the name of each key is the name of the corresponding input variable. This way the block simply picks the value for an input variable from the top-level key with the same name inside the input JSON.
2. If the block doesn't have input properties and is the first in a flow, the input coincides with the workflow's input. This is one case of implicit input.
3. If the block doesn't have input properties and is not the first in a flow, the input is the whole output of the block immediately preceding it in the flow. If the block is at the end of multiple connections, it behaves as if it were preceded by an invisible Join block in which all those connections end, so it receives the output of that Join block. This is the other case of implicit input.
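The three rules above can be sketched as a single function. This is a simplified model written for illustration, not the orchestration service's implementation: the function name, its parameters and the representation of input properties as a dictionary are all assumptions, and the invisible Join case is left out.

```python
from typing import Dict, Optional, Tuple

# Hypothetical representation of input properties:
# input variable name -> (upstream block name, output key of that block)
Mapping = Dict[str, Tuple[str, str]]


def build_block_input(
    input_properties: Optional[Mapping],
    is_first_in_flow: bool,
    workflow_input: dict,
    outputs: Dict[str, dict],
    previous_block: Optional[str] = None,
) -> dict:
    # Rule 1: the block has input properties, so the JSON gets one
    # top-level key per set property, named after the input variable.
    if input_properties is not None:
        return {
            variable: outputs[block][key]
            for variable, (block, key) in input_properties.items()
        }
    # Rule 2: no input properties and first in a flow: the input
    # coincides with the workflow's input (implicit input).
    if is_first_in_flow:
        return workflow_input
    # Rule 3: no input properties and not first: the input is the whole
    # output of the immediately preceding block (implicit input).
    return outputs[previous_block]


# "converter" and "content" are made-up names for an upstream block and its key.
outputs = {"converter": {"content": "Hello", "pages": 1}}
print(build_block_input({"text": ("converter", "content")}, False, {}, outputs))
# {'text': 'Hello'}
```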
Block input properties
A block's input properties provide instructions to the workflow orchestration service on how to set input variables at runtime (see rule 1 above).
Blocks of components with input variables have corresponding editable input properties if they are not the first in a flow or if the $nlflow_input pseudo-block has been defined.
Setting input properties is mandatory, but not every property needs to be set, because some can be optional. Which properties to set depends on whether they are required and on the designer's choices.
Once set, an input property is a mapping between a key of the output of an upstream block and an input variable, so input properties are set through input mapping.
Implicit input
If you find a read-only list of input variables in the Input tab when editing a block, it means that the block's component has input variables, but the block doesn't have editable input properties. This happens in a very specific situation, namely when the block is the first in a flow and there's no upstream $nlflow_input pseudo-block.
At runtime, the workflow orchestration service will apply rule 2 to build the input JSON. It will pass the workflow's input JSON to the block, verbatim, and the block will look inside that JSON for top-level keys whose names and types match a valid combination of its input variables. If required keys are missing or there are conflicting keys, an error will occur.
Another case of implicit input is that of script interpreter blocks for which the designer cannot or chooses not to define input variables.
Without input properties, the workflow orchestration service applies rule 2 or rule 3 to build the input JSON for the block at runtime. The JavaScript or Python custom code of the block will find an equivalent of the input JSON inside the only parameter of the process function.
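A Python script block relying on implicit input might look roughly like the sketch below. Only the process function and its single parameter come from the description above. The parameter name, the text key and the choice of returning a dictionary are assumptions made for the example; how results must actually be returned is described in the article about the interpreter.

```python
def process(input_json):
    # input_json stands for the equivalent of the block's input JSON:
    # the workflow's input (rule 2) or the whole output of the
    # preceding block (rule 3).
    # "text" is a hypothetical key; .get avoids a KeyError if it is missing.
    text = input_json.get("text", "")

    # Returning a dictionary is an assumption made for this sketch.
    return {"length": len(text)}


print(process({"text": "hello"}))  # {'length': 5}
```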
A block of Join and End Switch operators doesn't have input properties and its implicit input is the output of the flows that converge in the block.
Input mapping
Whenever a block has input properties, the designer must set them in a valid and purposeful manner. This is done in the workflow editor via input mapping, that is, by pairing the "right" input variables with the "right" top-level keys that are expected to be present in the output JSON of upstream blocks.
At runtime, the presence of input properties causes the workflow orchestration service to apply rule 1 to build the block's input JSON, and the values of the input properties (that is, the mappings) tell it exactly which keys to include in the JSON and how to name them.
To know how to perform input mapping, read the article dedicated to the topic.
Workflow input as a pseudo output
Possible issues with workflow's input
By default, the workflow's input is the implicit input to the first blocks of any flow and only to those. This can lead to the following issues:
- Input cannot be found: the first block of the flow has input variable A, string, but the workflow's input JSON contains a string top-level key called B. This causes a runtime error.
- Ambiguous input: the first block of the flow has input variables A and B which are mutually exclusive and workflow's input contains both. This causes a runtime error.
- Unwanted input: the first block has input variable A and doesn't tolerate the presence of other top-level keys in the workflow's input which, however, also contains top-level key B. This causes a runtime error.
- Limited visibility: a block deep in a flow needs data present in the workflow's input, but has no visibility of it.
Solution
The designer can solve the above issues by creating an invisible block that is upstream of any other block in the workflow and whose output is a selection of the top-level keys of the workflow's input JSON. Once this pseudo-block, named $nlflow_input, has been defined, its output keys can be mapped to the input variables of any block, no matter how deep in a flow. In addition, the first block of any flow, if its component has input variables, gets editable input properties, so those blocks no longer receive implicit input.
So the solutions to the above issues are:
- Input cannot be found: map the A input variable to the B key of the $nlflow_input object.
- Ambiguous input: decide which input variable to set at runtime and map only that to the corresponding key of the $nlflow_input object.
- Unwanted input: map only the A input variable to the A key of the $nlflow_input object.
- Limited visibility: map the input variables of the deep block to keys of the $nlflow_input object.
To know how to create the $nlflow_input pseudo-block read the article dedicated to the topic.
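The effect of the pseudo-block on the "Input cannot be found" and "Unwanted input" cases can be sketched as follows. Both helper functions are hypothetical and only model the behavior described above: the pseudo-block exposes a selection of the workflow input's top-level keys, and a mapping renames a key to the input variable it feeds.

```python
def nlflow_input_output(workflow_input: dict, selected_keys: list) -> dict:
    """Model the pseudo-block's output: a selection of top-level keys."""
    return {k: workflow_input[k] for k in selected_keys if k in workflow_input}


def apply_mapping(pseudo_output: dict, mapping: dict) -> dict:
    """mapping: input variable name -> key of the $nlflow_input output."""
    return {variable: pseudo_output[key] for variable, key in mapping.items()}


# The workflow input has key B (and an extra key C), but the first block
# of the flow only has input variable A.
workflow_input = {"B": "some text", "C": True}

pseudo = nlflow_input_output(workflow_input, ["B"])
first_block_input = apply_mapping(pseudo, {"A": "B"})
print(first_block_input)  # {'A': 'some text'}
```

Because the block's input is now built from the mapping, key C never reaches the block, and variable A is set even though the client called the key B.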
Automatic mapping
When the designer connects a block to another block that has input variables, the editor attempts to automatically set the input properties of the second block by mapping input variables to top-level keys of the declared output of the first block: this is automatic input mapping.
Automatic input mapping is based on the output manifest of the first block: thanks to the manifest, the editor automatically "knows" the name and the type of the top-level keys of the output JSON of the block. If there is a name-and-type match with the input variables of the second block, the mapping is created.
For example, if a block of component X, which has output keys A (string) and B (boolean) is connected to a block of component Y that has input variables A (string) and B (boolean), the editor automatically sets the input properties of the block of type Y by mapping the two input variables to the two top-level keys of the block of type X that have the same name and type.
Even if successful, automatic mapping may not be what the designer wants, so he can always change it later by editing the block.
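The name-and-type matching described above can be approximated with a few lines of code. This is a sketch of the matching idea only, with manifests reduced to dictionaries from names to type names. The function name is hypothetical, and the sketch assumes that variables without a match are simply left unmapped.

```python
def auto_map(output_manifest: dict, input_variables: dict) -> dict:
    """Map each input variable to the upstream output key that has the
    same name and the same type. Both arguments are name -> type."""
    return {
        name: name
        for name, var_type in input_variables.items()
        if output_manifest.get(name) == var_type
    }


# Block of component X exposes A (string) and B (boolean);
# block of component Y expects A (string) and B (boolean).
print(auto_map({"A": "string", "B": "boolean"}, {"A": "string", "B": "boolean"}))
# {'A': 'A', 'B': 'B'}
```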
Semi-automatic mapping with assistant
When the conditions for automatic mapping are not met, it is sometimes possible for the designer to make a second attempt using the mapping assistant, symbolized by a magic wand icon in the Input tab of the block properties dialog.
In this case, the mapping assistant searches for unambiguous type-only matches between the top-level keys of the first block's output manifest and the input variables of the second block.
For example, if a block of component X, which has output key A (string), is connected to a block of component Y that has an input variable B (string), the editor will associate the key A with variable B because they have the same type and there are no other output keys of the same type.
If, on the other hand, the first block may produce more output keys of type string, the assistant cannot choose which to use and mapping cannot take place.
Semi-automatic input mapping with the assistant is at the discretion of the designer, who can decide not to use it and proceed with manual mapping. Also, as for automatic mapping, it can always be changed later.
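The type-only matching performed by the assistant can be sketched in the same style. As before, this is an approximation under stated assumptions and not the editor's actual logic: manifests are reduced to name -> type dictionaries, the function name is hypothetical, and ambiguity is checked only on the side of the output keys.

```python
def assistant_map(output_manifest: dict, input_variables: dict) -> dict:
    """Map an input variable to an output key when exactly one output key
    has the same type. Both arguments are name -> type."""
    mapping = {}
    for variable, var_type in input_variables.items():
        candidates = [key for key, t in output_manifest.items() if t == var_type]
        # Only an unambiguous match (a single candidate) produces a mapping.
        if len(candidates) == 1:
            mapping[variable] = candidates[0]
    return mapping


print(assistant_map({"A": "string"}, {"B": "string"}))                 # {'B': 'A'}
print(assistant_map({"A": "string", "C": "string"}, {"B": "string"}))  # {}
```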
Manual mapping
Manual input mapping is done by the designer according to his wishes. He sets by hand the input properties for the input variables that he wants to be set at runtime by choosing suitable keys of upstream blocks including, if defined, the $nlflow_input pseudo-block. Upstream blocks expose their output keys through their output manifests.
In the case of the Map operator, Simple Reducer and Simple Remapper, the input variables are not predefined at the component level and the designer must create them before being able to map them to output keys of upstream blocks.
In the case of a script interpreter (but not version 1.0.0 of the JavaScript Interpreter), the designer may create input variables when implicit input is not adequate.