Modules overview
What are modules?
As stated in the scripting overview, the main.jr file is where the text intelligence engine expects to find a script that allows extending the document analysis pipeline. When executing the pipeline, the engine triggers the event handlers defined in that file and those event handlers can use other functions and variables defined within the same file.
It is also possible to use code written in other files, provided that they have the .jr extension and their code is suitably structured. These additional files are called modules.
A module is roughly comparable to a class of objects.
Module objects are instantiated with this syntax:
var modVar = require(modulePath)
where modulePath is the path of the module's .jr file relative to the project folder, without the extension.
The variable modVar becomes an instance of the module.
For example, this statement:
var sphere1 = require("modules/sphere");
makes the variable sphere1 an instance of the sphere module, which corresponds to the sphere.jr file located in a sub-folder named modules inside the project folder.
Note
The use of require() is not limited to the main.jr file: it can also be used inside a module to instantiate other modules.
Methods and properties
By default, functions, variables and constants defined within a module have a local scope, which means they are not accessible to code outside the module.
The functions, variables and constants that you want to make accessible from outside the module must be "exported", which is achieved by assigning each of them to a property of the predefined exports object.
In this way, functions become methods, while variables and constants become properties, of the objects instantiated with the require() function.
To illustrate this, let's use the example of a geometric object, the sphere.
What we want is a module whose instances represent spheres, with a create method that works like a class constructor and takes the radius of the sphere as its only parameter.
We want every sphere to expose the values of its surface and its volume through object properties.
The module file could be called sphere.jr, be located in the modules sub-folder of the project and have these contents:
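// Local variables: not visible outside the module
var localSurface;
var localVolume;

// Local helper: surface area of a sphere
function surfaceFunction(radius) {
    return 4 * Math.PI * Math.pow(radius, 2);
}

// Local helper: volume of a sphere
function volumeFunction(radius) {
    return 4 / 3 * Math.PI * Math.pow(radius, 3);
}

// Computes both values and exposes them as public properties
function createFunction(radius) {
    localSurface = surfaceFunction(radius);
    localVolume = volumeFunction(radius);
    exports.surface = localSurface;
    exports.volume = localVolume;
}

// Expose createFunction as the create() method
exports.create = createFunction;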
The createFunction function "creates" a sphere with the given radius by computing its surface and its volume. The results of these computations are first stored in two local variables, which are then exposed as module (and therefore object) properties using the exports object. Finally, the createFunction function itself is exposed as a module method.
exports.surface, exports.volume and exports.create are properties of the exports object defined on the fly. The first two expose the internal variables localSurface and localVolume as the public properties surface and volume, while the third exposes the function createFunction as the create() method.
Neither the surfaceFunction and volumeFunction functions nor the localSurface and localVolume variables are visible from outside the module.
The module can be used like this inside main.jr:
var sphere1 = require("modules/sphere");
sphere1.create(7.5);
CONSOLE.log('The surface of a sphere with a radius of 7.5 is ' + sphere1.surface + ' and the volume is ' + sphere1.volume);
As stated above, constants can also become public properties of a module, using this syntax:
exports.property = constant
Studio ready-to-use modules
You are free to write custom modules, but first consider the following modules, which are predefined in Studio so they can be copied and used in any project:
Module | Functionalities |
---|---|
dompost | Post-processing of output categories' labels |
jmespath | Parsing and navigation of JSON objects based on the JMESPath query language |
jsonpath | Parsing and navigation of JSON objects based on the JSONPath query language |
jsonPlug | Free style manipulation of results |
linkPost | Aggregation of fields from different records and record validation |
mergepost | Merger of extraction records |
moment | Parsing, validation and manipulation of dates and times |
normalizepost | Normalization of extracted values |
regexcleaner | Advanced find-and-replace operations based on regular expressions |
tagHierarchy | Hierarchy between strong and weak tags |
Three of these modules are ports of open source projects:
- moment: https://github.com/moment/moment
- jmespath: https://github.com/jmespath/jmespath.js/
- jsonpath: https://github.com/dchester/jsonpath
They expose the same functionality as their GitHub counterparts. The other modules are described in detail in the following articles in this section.
You can manage your modules with the JR Modules Manager.