SCRIPT

Overview

The SCRIPT transformation option changes output values by means of scripting functions. Both built-in and user-defined functions can be used.

Consider this example using the toUpper built-in function:

SCOPE SENTENCE
{
    IDENTIFY(TEST)
    {
        @FIELD1[LEMMA("twenty")]|[SCRIPT("toUpper")]
    }
}

If the rule is applied to this input text:

Jane and Markus married twenty years ago

you will get this record:

Template: TEST

Field	Value
@FIELD1	TWENTY

The rule extracts TWENTY in uppercase, even though the matched text was lowercase in the input.

SCRIPT can be combined with the other transformation options using the plus sign (+). In such a combination, the transformations are applied in sequence.

The syntax in an extraction rule is:

SCOPE scopeOption
{
    IDENTIFY(templateName)
    {
        @field[attribute]|[SCRIPT("function1 name[:parameter]" [, "function2 name[:parameter]" ...])]
    }
}

The syntax in a tagging rule is:

SCOPE scopeOption
{
    TAGGER(tagLevel)
    {
        @tag[attribute]|[SCRIPT("function1 name[:parameter]" [, "function2 name[:parameter]" ...])]
    }
}

where parameter is an optional parameter of the function.

Apart from those around attribute and SCRIPT, all the other square brackets indicate optional parts.

When multiple functions are specified, each function acts on the output of the previous one, and the final value is the output of the last function.
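
The chaining behavior described above can be pictured with a small JavaScript model. This is only an illustration under stated assumptions, not the engine's code: the applyScriptChain helper and the functions table are hypothetical names, and only the "name[:parameter]" form of each entry comes from the syntax above.

```javascript
// Illustrative model only: this is not how the engine is implemented.
// Hypothetical table of transformation functions, keyed by name.
const functions = {
    toUpper: (value) => value.toUpperCase(),
    toLower: (value) => value.toLowerCase(),
};

// Applies each "name[:parameter]" entry in order,
// feeding the output of one function into the next.
function applyScriptChain(value, specs) {
    return specs.reduce((current, spec) => {
        const colon = spec.indexOf(":");
        const name = colon === -1 ? spec : spec.slice(0, colon);
        const parameter = colon === -1 ? undefined : spec.slice(colon + 1);
        if (typeof functions[name] !== "function") {
            throw new Error("Unknown function: " + name);
        }
        return functions[name](current, parameter);
    }, value);
}

console.log(applyScriptChain("Twenty", ["toUpper", "toLower"])); // prints "twenty"
```

Reversing the order of the two entries yields TWENTY instead, because the last function determines the final value.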

User-defined functions

User-defined functions must be defined in the main.jr file.
Their definition must have this syntax:

function name(tokenID, extraction, parameter)

The names of the parameters do not matter, but their position determines their role, and all three must be declared even if they are not used in the body of the function.

In the first parameter, the text intelligence engine passes the ID of the text token it is examining when it invokes the function during the extraction of the field. Combined with the methods of the DIS pre-defined object, this allows for sophisticated transformations based on the properties of the token.

In the second parameter, the engine passes, as a string, the value extracted up to that moment.

In the third parameter, the engine passes the value of any parameter specified in the rule after the name of the function and the colon. This allows for parametric transformations: the same function can be called with different parameters depending on the condition or rule.

The function must return a string, and the engine uses that return value as the new value of the current extraction.
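
As a minimal sketch of the signature described above, a user-defined function in main.jr might look like the following. The function name addPrefix is hypothetical. What the engine passes in the third parameter when the rule specifies none is not stated here, so the sketch treats a missing value as an empty string.

```javascript
// Hypothetical user-defined function (the name addPrefix is an assumption).
// All three parameters are declared, even though tokenID is not used here.
function addPrefix(tokenID, extraction, parameter) {
    // parameter holds the text written after the colon in the rule, if any.
    // Assumption: a missing parameter is treated as an empty prefix.
    var prefix = (parameter === undefined || parameter === null) ? "" : String(parameter);
    // The returned string becomes the new value of the extraction.
    return prefix + extraction;
}
```

Assuming this definition, a rule could invoke it with SCRIPT("addPrefix:ID-"), so that an extracted value of twenty would become ID-twenty.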

Info

It is not possible to pass additional parameters directly to functions called within the SCRIPT transformation. To work around this limitation, consider packing the extra information into the single parameter string. Inside the function, you can then split the string and handle its components according to your specific needs.
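
The workaround above could be sketched as follows. The function name padLeft and the semicolon separator are assumptions made for this illustration. Any separator can be chosen, provided it cannot occur inside the values themselves.

```javascript
// Hypothetical function: pads the extraction on the left up to a given width.
// Both settings travel in one parameter string, for example "5;0"
// (width 5, fill character 0). The ";" separator is an assumption.
function padLeft(tokenID, extraction, parameter) {
    var parts = String(parameter || "").split(";");
    var width = parseInt(parts[0], 10);
    // Use a space when no fill character is supplied.
    var fill = (parts.length > 1 && parts[1] !== "") ? parts[1].charAt(0) : " ";
    var result = String(extraction);
    // Without a usable width, return the value unchanged.
    if (isNaN(width)) {
        return result;
    }
    while (result.length < width) {
        result = fill + result;
    }
    return result;
}

console.log(padLeft(0, "42", "5;0")); // prints "00042"
```

Under these assumptions, a rule would call the function as SCRIPT("padLeft:5;0").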

Built-in functions

These are the built-in functions that can be used with the SCRIPT transformation option:

toUpper
toLower
replaceString

toUpper

toUpper turns the extracted value into uppercase. Consider the example above to see how it works.

This function does not have a parameter.

toLower

toLower turns the extracted value into lowercase. Consider this example:

SCOPE SENTENCE
{
    IDENTIFY(TEST)
    {
        @FIELD1[KEYWORD("ROSE")]|[SCRIPT("toLower")]
    }
}

If this rule is applied to this text:

He bought me a ROSE.

you get this record:

Template: TEST

Field	Value
@FIELD1	rose

This function does not have a parameter.

replaceString

replaceString replaces all the occurrences of a string with another in the extracted value.

For example, if this rule:

SCOPE SENTENCE
{
    IDENTIFY(BUSINESS_STATS)
    {
        @LENGHT_OF_TIME[KEYWORD("qt")]|[SCRIPT("replaceString:qt|qtr")]
    }
}

is applied to this input text:

Profit is up 12% in 3rd qt.

you get the extraction of qtr instead of qt.

The replaceString function takes a parameter with this syntax:

replaceString:stringToReplace|replacementString
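
To make the parameter format concrete, here is a plain JavaScript model of the documented behavior. It is not the engine's implementation. The name replaceStringModel is hypothetical, and leaving the value unchanged when the separator or the search string is missing is an assumption of this sketch.

```javascript
// Model of the documented behavior only, not the built-in function itself.
function replaceStringModel(extraction, parameter) {
    // The parameter has the form "stringToReplace|replacementString".
    var bar = parameter.indexOf("|");
    // Assumption: with no separator or an empty search string, change nothing.
    if (bar <= 0) {
        return extraction;
    }
    var target = parameter.slice(0, bar);
    var replacement = parameter.slice(bar + 1);
    // Splitting and re-joining replaces every occurrence of target.
    return extraction.split(target).join(replacement);
}

console.log(replaceStringModel("qt", "qt|qtr")); // prints "qtr"
```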