Workflow and block input

Workflow input

The input sent to a workflow when it is executed is always a JSON object; how that object is structured depends entirely on the structure and purpose of the workflow.

Since a workflow is a program that can theoretically do anything, let's step outside the common case of natural language processing for a moment and imagine workflows applied to home automation.
A workflow whose sole purpose is to connect to a home automation system and turn off the lights does not need any input, so the JSON can be an empty object or any other object: it makes no difference. A workflow that can turn the lights off or on according to the user's wishes, on the other hand, needs input and behaves differently depending on what the input contains.
With this input:

{
    "lights": "off"
}

it turns the lights off, while with this:

{
    "lights": "on"
}

it turns them on.
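
As a rough illustration only, the on/off workflow can be pictured as a function of its input JSON. The Python sketch below is not NL Flow code: the function name run_lights_workflow and the returned report are hypothetical, and the actual call to the home automation system is left out.

```python
def run_lights_workflow(workflow_input: dict) -> dict:
    """Hypothetical stand-in for a workflow that switches the lights."""
    wish = workflow_input.get("lights")
    if wish not in ("on", "off"):
        # This workflow cannot operate without a usable "lights" key.
        raise ValueError('expected "lights" to be "on" or "off"')
    # A real workflow would contact the home automation system here.
    return {"lights": wish}


print(run_lights_workflow({"lights": "off"}))
print(run_lights_workflow({"lights": "on"}))
```

The same workflow behaves differently depending only on the content of the JSON it receives.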

Since a workflow is made up of blocks, it is the blocks that, depending on their nature, need input to operate.
Blocks needing input can take it from the JSON submitted to the workflow (the workflow input) or from the output of upstream blocks, as explained below.

Input expected by a block

Blocks based on some standard components of the NL Flow inventory need input to operate.
The part of this reference section dedicated to workflow components states, component by component, whether its blocks expect input and for what purpose.

Depending on the component, the input can consist of data to be processed, options or commands that affect the behavior of the block, or a mix of the two. Custom component blocks can have similar input needs. Script interpreter processor blocks are extremely flexible regarding their input: they can ignore it completely, have only code-level constraints on its content, or have a formal declaration of input variables.

Input variables

Blocks of some components have input variables, each with a name and a type, either by design or, in the case of script interpreters, because the user has defined them. Input variables correspond to top-level keys of the input JSON that is passed to the block at execution time.
Which variables must be set depends on the block's operating logic. For example, a basic ML model block has a string variable called text for the text to be analyzed and an object variable called documentLayout for text with graphic layout, but setting both doesn't make sense because they are mutually exclusive.
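
The mutual exclusivity of text and documentLayout can be expressed as a simple check on the block's input JSON. This is a minimal Python sketch, not the actual validation performed by NL Flow: the function name check_ml_block_input is hypothetical, and the assumption is that exactly one of the two variables must be set.

```python
def check_ml_block_input(block_input: dict) -> str:
    """Return the name of the variable in use, or raise if the input is invalid."""
    has_text = "text" in block_input
    has_layout = "documentLayout" in block_input
    if has_text == has_layout:
        # Either both are set or neither is: assumed invalid in this sketch.
        raise ValueError("set exactly one of 'text' and 'documentLayout'")
    if has_text and not isinstance(block_input["text"], str):
        raise TypeError("'text' must be a string")
    if has_layout and not isinstance(block_input["documentLayout"], dict):
        raise TypeError("'documentLayout' must be an object")
    return "text" if has_text else "documentLayout"


print(check_ml_block_input({"text": "Some text to analyze"}))
```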

For many components, the dialog used to manage a block's properties contains an Input tab or similar. This is where users of the NL Flow GUI can inspect and, where possible, configure the input variables of the block. For script interpreter blocks (depending on the version) and for the Map operator, it's up to the user to define the input variables together with their mapping.

This reference section describes, for each component, all the input variables and any valid combinations of them.

Who builds the input for the blocks?

The actual input submitted to a workflow is established by whoever uses the workflow through its API or by whoever runs interactive tests.
The input to each individual block, by contrast, is built on the fly by the workflow orchestration service, which is an essential part of every NL Flow runtime. Depending on the block configuration (see below), the input to a block can be taken from the input to the workflow, from the output of one or more previously executed blocks, or be a collage of data from different sources.

Connection = output-input relationship?

Connecting block X to block Y does not automatically mean that the output of block X will become the input of block Y when the workflow is executed: the actual input to block Y depends on its component and possibly on the configuration of its input properties.
The connection and its direction (where the arrow starts and ends) determine the order of execution of the blocks (first X, then, immediately after, Y), but not necessarily the relationship between the output of X and the input of Y. Block Y can have an input that has a completely different origin from X and even a mix of origins.
For blocks with input variables, the actual input JSON built on the fly by the workflow orchestrator depends on how the block has been configured, in particular on how and to what its input variables have been mapped.

However, the editor doesn't rule out the possibility that a block should "simply" receive its input from the previous block. In fact, if block Y has input variables, the editor always attempts automatic mapping.
Also, the blocks of some components, like the Join operator, do not have input variables and take their input from the output of all the upstream blocks directly connected to them.

Block input properties

A block's input properties correspond to its input variables. When implicit mapping does not apply, the input properties need to be configured via explicit mapping.

Implicit mapping

If you find a read-only list of input variables when editing a block, it means that no configuration is needed. This happens in a very specific situation, namely when the block is the first with input variables in a flow and the structure of the workflow input has not been formally described to NL Flow.
In this case, the workflow orchestrator passes the workflow JSON input to the block verbatim, and the block performs an implicit mapping, taking as its input the top-level keys of the JSON that have the same name and type as a valid combination of its input variables.
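
The name-and-type selection behind implicit mapping can be sketched as follows. This Python example is only an approximation under stated assumptions: the function implicit_mapping and the JSON_TYPES table are hypothetical, only a few JSON types are covered, and the check that the selected keys form a valid combination for the component is deliberately left out.

```python
# Assumed correspondence between declared variable types and Python types.
JSON_TYPES = {"string": str, "boolean": bool, "object": dict, "array": list}


def implicit_mapping(workflow_input: dict, input_variables: dict) -> dict:
    """Keep only the top-level keys matching a declared variable by name and type.

    input_variables maps each variable name to its declared type name.
    """
    return {
        name: value
        for name, value in workflow_input.items()
        if name in input_variables
        and isinstance(value, JSON_TYPES[input_variables[name]])
    }


declared = {"text": "string", "documentLayout": "object"}
submitted = {"text": "hello", "extra": "ignored", "documentLayout": "not an object"}
print(implicit_mapping(submitted, declared))
```

In this run only the text key survives: extra matches no variable name, and documentLayout has the right name but the wrong type.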

Explicit mapping

If a block with input variables is preceded by other blocks with input variables or if the structure of the workflow's input has been explicitly described, then it is necessary to configure the input properties. Configuring the input properties means "explaining" how (that is, with what) to give a value to the input variables of the block at execution time. This is achieved by mapping the input variables to the data that is expected to be present either in the output of upstream blocks or in the input to the workflow.

Block output manifest

One of the possible sources of the value of an input variable of a block is any top-level key of the output JSON of a block that is upstream in the flow.
All components that can produce outputs suitable for mapping have a declaration, or manifest, of the structure of their output that is accessible to the editor, that is, the list of top-level keys that can appear in their output JSON, each with its name and type. So when adding or editing a block with input variables, knowledge of the possible output of directly or indirectly connected upstream blocks allows for automatic, semi-automatic and manual mapping of input variables.

Workflow input as a pseudo output

Possible issues with workflow's input

By default, only the first block with input variables in a flow has its input variables implicitly mapped to the workflow input, while other blocks have no visibility into it.
This can lead to the following issues:

The JSON input can contain ambiguous or unexpectedly named top-level keys for the first block with input variables of one or more of the flows. For example, the block has a string input variable A, but the JSON input contains a string top-level key called B. Or the JSON input contains both A and B and there are two flows, one starting with a block with input variable A and the other starting with a block with input variable B, but one or both blocks do not tolerate the presence of more input (B or A) than expected at execution time, and fail.
A block deep in a flow may need data present in the input to the workflow. By default the block has no direct visibility into the workflow input, so workarounds are necessary, such as script interpreter blocks at the beginning of the flow whose sole purpose is to echo their input, so that their output can then be mapped to the input variables of downstream blocks.

Solution

The above issues can be solved by providing NL Flow with an explicit description of the workflow's input JSON, that is, a list of the top-level keys of interest contained in the JSON.
Doing so defines a virtual block called $nlflow_input, located upstream of the workflow and invisible in its diagram. It is available when the workflow is run and its pseudo output contains only the listed keys.
The virtual block is considered upstream of every other block, so its "output" keys can be mapped to the input properties of any block in the diagram, thus "disclosing" the contents of the workflow's input JSON to every block that could make use of it. As a consequence, the first block with input variables in a flow also needs its input properties to be configured: they are no longer read-only.

So the solutions to the above issues are:

By mapping the top-level keys of the $nlflow_input object, you can associate only the "correct" ones with the block's input variables. This way, at runtime, the workflow orchestrator service packages an input JSON for the block containing only those keys, without ambiguity. In the case of name mismatches, you can simply map the top-level key A of the $nlflow_input object to the input variable B of the block, and the workflow orchestrator packages an input JSON in which the key A is renamed B.
For a block deep in a flow that needs data from the workflow's input JSON, you can map its input variables to output keys of the $nlflow_input virtual block.
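
The way the orchestrator might package a block's input from explicit mappings, including the renaming of key A to variable B, can be sketched like this. The Python code is an illustration under assumptions, not the orchestrator's implementation: the function names, the shape of the mapping (variable name to a pair of source block and key) and the outputs dictionary are all hypothetical.

```python
def nlflow_input_output(workflow_input: dict, declared_keys: list) -> dict:
    """Pseudo output of the $nlflow_input virtual block: only the declared keys."""
    return {key: workflow_input[key] for key in declared_keys if key in workflow_input}


def build_block_input(mapping: dict, outputs: dict) -> dict:
    """Package a block's input JSON.

    mapping: input variable name -> (source block name, top-level key)
    outputs: source block name -> output JSON of that block
    """
    return {
        variable: outputs[source][key]
        for variable, (source, key) in mapping.items()
    }


workflow_input = {"A": "some text", "C": True}
outputs = {"$nlflow_input": nlflow_input_output(workflow_input, ["A"])}

# Key A of the workflow input feeds input variable B of the block.
print(build_block_input({"B": ("$nlflow_input", "A")}, outputs))
```

The undeclared key C never reaches the block, and the block receives the value of A under the name B.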

Automatic mapping

When you connect a block to another block that has input variables, the editor attempts to automatically configure the input properties of the downstream block by mapping its input variables to top-level keys of the declared output of the upstream block.
Automatic mapping is based on the output manifest of the upstream block: thanks to the manifest, the editor automatically "knows" the name and the type of the top-level keys of the output JSON of the upstream block. If there is a name plus type match with the input variables of the downstream block, the configuration takes place.
For example, if you connect a block of component X, which has output composed of the top-level keys A (string) and B (boolean) to a block of component Y that has input variables A (string) and B (boolean), the editor automatically configures the input properties of the block of type Y by mapping the two input variables to the two top-level keys of the block of type X that have the same name and type.

Even when successful, automatic mapping may not be what the user wants, so the editor always notifies the user that auto-mapping has taken place, and the mapping can be changed later by editing the block.
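
The name-plus-type matching described above can be sketched in Python. This is only an illustration: the function auto_map and the representation of manifests as dictionaries from key name to type name are hypothetical, and whether the editor accepts partial matches is not stated here, so the sketch simply returns whatever matches it finds.

```python
def auto_map(output_manifest: dict, input_variables: dict) -> dict:
    """Map each input variable to the upstream key with the same name and type."""
    return {
        variable: variable
        for variable, var_type in input_variables.items()
        if output_manifest.get(variable) == var_type
    }


# Component X outputs A (string) and B (boolean); component Y expects the same.
print(auto_map({"A": "string", "B": "boolean"}, {"A": "string", "B": "boolean"}))
# No name match: nothing is mapped automatically.
print(auto_map({"U": "string", "B": "boolean"}, {"A": "string"}))
```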

Semi-automatic mapping with assistant

When the conditions for automatic mapping are not met, the user can sometimes ask the editor to attempt the mapping again using an assistant, symbolized by a magic wand icon. In this case, the assistant searches for matches between the top-level keys of the source block's output JSON and the input variables of the destination block at the data type level only. For example, if you connect a block of component X, whose output is composed of the top-level keys U (string) and B (boolean), to a block of component Y that has an input variable A (string), the assistant associates the key U with the variable A. If, on the other hand, the source block produces more than one string key, the assistant cannot decide on its own which of them to map.

Semi-automatic mapping with the assistant is at the discretion of the user, who can decide not to use it and proceed with manual mapping. If it is used, the resulting mapping can always be edited later.
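
The assistant's type-only matching, including the case where it cannot decide, can be sketched as follows. The Python function assistant_map is hypothetical and reflects only the behavior described above: a variable is mapped when exactly one upstream key has its type, and left unmapped otherwise.

```python
def assistant_map(output_manifest: dict, input_variables: dict) -> dict:
    """Map each input variable to the single upstream key of the same type, if any."""
    mapping = {}
    for variable, var_type in input_variables.items():
        candidates = [
            key for key, key_type in output_manifest.items() if key_type == var_type
        ]
        if len(candidates) == 1:
            mapping[variable] = candidates[0]
        # Zero or several candidates: the choice is left to the user.
    return mapping


# U is the only string key, so it is associated with A.
print(assistant_map({"U": "string", "B": "boolean"}, {"A": "string"}))
# Two string keys: the assistant cannot choose between them.
print(assistant_map({"U": "string", "V": "string"}, {"A": "string"}))
```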

Manual mapping

Manual mapping consists of associating the input variables that the user wants to set when executing the block with top-level keys of the output JSON of upstream blocks or, if defined, of the $nlflow_input virtual block that represents the workflow input.
Using the components' output manifests, the editor can present the user with drop-down menus listing all the suitable top-level keys.
If an input variable is of type object, it is also possible to map it to the entire output of a previous block or to the $nlflow_input object, because the output of any block is a JSON object by definition.
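
How the choices offered for a given input variable might be computed from the manifests can be sketched in Python. This is a guess at the logic, not the editor's code: the function suitable_sources is hypothetical, "suitable" is assumed to mean "of the same type", and None is used here as a marker for the entire output of a block.

```python
def suitable_sources(var_type: str, manifests: dict) -> list:
    """List (block, key) pairs that could feed a variable of the given type.

    manifests: block name -> {top-level key name: type name}
    A key of None stands for the entire output of the block.
    """
    options = []
    for block, manifest in manifests.items():
        if var_type == "object":
            # Any block's whole output is a JSON object.
            options.append((block, None))
        for key, key_type in manifest.items():
            if key_type == var_type:
                options.append((block, key))
    return options


manifests = {
    "$nlflow_input": {"text": "string"},
    "X": {"A": "string", "doc": "object"},
}
print(suitable_sources("string", manifests))
print(suitable_sources("object", manifests))
```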