queryJsonPath
Standard output types
The queryJsonPath method extracts the values matched by a JSONPath (or by a series of absolute/relative JSONPaths) and returns them in various formats.
This method complements other jsonPlug actions by enabling enhanced validation and the identification of complex items that demand advanced querying for subsequent scripting activities. It leverages the custom relative-node parsing feature seamlessly integrated into the JSONPath language by the script.
For example, consider this template:
TEMPLATE(CHARACTERS)
{
    @CHARACTER_NAME,
    @CHARACTER_NICKNAME,
    @CHARACTER_DATE_OF_BIRTH
}
If these rules:
SCOPE SENTENCE
{
    IDENTIFY(CHARACTERS)
    {
        @CHARACTER_NAME[TYPE(NPH)]
        <>
        @CHARACTER_NICKNAME[KEYWORD("spider-man")]
        <>
        @CHARACTER_DATE_OF_BIRTH[TYPE(DAT)]
    }
    IDENTIFY(CHARACTERS)
    {
        @CHARACTER_NAME[TYPE(NPH)]
        <>
        @CHARACTER_NICKNAME[KEYWORD("peter parker")]
        <>
        @CHARACTER_DATE_OF_BIRTH[TYPE(DAT)]
    }
}
are applied to this input text:
Peter Parker, known as Spider-Man, was created by Stan Lee and Steve Ditko in 1962.
Miles Morales, known as the modern Peter Parker, was created by Brian Michael Bendis and Sara Pichelli in 2011.
you will get these records:
Template: CHARACTERS
Field | Value |
---|---|
@CHARACTER_NICKNAME | Spider-Man |
@CHARACTER_NAME | Peter Parker |
@CHARACTER_DATE_OF_BIRTH | 1962 |
Template: CHARACTERS
Field | Value |
---|---|
@CHARACTER_NICKNAME | Peter Parker |
@CHARACTER_NAME | Miles Morales |
@CHARACTER_DATE_OF_BIRTH | 2011 |
With this code:
function onFinalize(result) {
    var nickname_values = jsonPlug.queryJsonPath(result, {
        jsPath: "$.match_info.rules.extraction[?(@.template == 'CHARACTERS')].fields[?(@.field == 'CHARACTER_NAME')].value",
        outputType: "regex",
        modifierFlag: true
    });
    jsonPlug.jsonPlug(result, {
        action: "delete",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'CHARACTERS')].fields[?(@.field == 'CHARACTER_NICKNAME' && " + nickname_values + ".test(@.value) )]",
        jsPathAction: "#this#",
        recursive: true
    });
    return result;
}
you will get these records:
Template: CHARACTERS
Field | Value |
---|---|
@CHARACTER_NICKNAME | Spider-Man |
@CHARACTER_NAME | Peter Parker |
@CHARACTER_DATE_OF_BIRTH | 1962 |
Template: CHARACTERS
Field | Value |
---|---|
@CHARACTER_NAME | Miles Morales |
@CHARACTER_DATE_OF_BIRTH | 2011 |
As you can see, the nickname_values variable has been populated with all the values of the CHARACTER_NAME field of the CHARACTERS template from the JSONPath.
The delete action of the jsonPlug method below the variable definition allows you to delete the CHARACTER_NICKNAME field if its value is matched by the regular expression generated in the variable.
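The idea behind the regex output type can be sketched in plain JavaScript. The helpers below, escapeRegex and valuesToRegexLiteral, are hypothetical and not part of jsonPlug. The exact expression that queryJsonPath generates (anchoring, escaping, flags) is not described here, so this sketch assumes a whole-value alternation and returns it as regex literal text that could be concatenated into a JSONPath filter.

```javascript
// Hypothetical helpers, not part of jsonPlug: they build the text of a regex
// literal that matches any of the given values as a whole string.
function escapeRegex(text) {
    // Escape characters with a special meaning in a regular expression,
    // plus "/" so the result is safe inside a regex literal.
    return String(text).replace(/[.*+?^${}()|[\]\\\/]/g, "\\$&");
}

function valuesToRegexLiteral(values, caseInsensitive) {
    var alternatives = values.map(escapeRegex).join("|");
    return "/^(?:" + alternatives + ")$/" + (caseInsensitive ? "i" : "");
}

var literal = valuesToRegexLiteral(["Peter Parker", "Miles Morales"], true);

// Turn the literal text back into a RegExp to try it out.
var lastSlash = literal.lastIndexOf("/");
var regex = new RegExp(literal.slice(1, lastSlash), literal.slice(lastSlash + 1));

console.log(literal);                    // /^(?:Peter Parker|Miles Morales)$/i
console.log(regex.test("peter parker")); // true
console.log(regex.test("Spider-Man"));   // false
```

Under these assumptions, the second record's nickname (Peter Parker) passes the test and the first record's nickname (Spider-Man) does not, which mirrors the records shown above.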
With this other example:
function onFinalize(result) {
    var nickname_values = jsonPlug.queryJsonPath(result, {
        jsPath: "$.match_info.rules.extraction[?(@.template == 'CHARACTERS')].fields[?(@.field == 'CHARACTER_NAME')].value",
        outputType: "regex",
        modifierFlag: true
    });
    var character_has_conflicting_nickname = jsonPlug.queryJsonPath(result, {
        jsPath: "$.match_info.rules.extraction[?(@.template == 'CHARACTERS')].fields[?(@.field == 'CHARACTER_NICKNAME' && " + nickname_values + ".test(@.value) )]",
        outputType: "boolean"
    });
    CONSOLE.log(character_has_conflicting_nickname);
    return result;
}
you will get the boolean value of true as output in the Output tab of the Console tool window, because the value of the CHARACTER_NICKNAME field of the second sentence is equal to the CHARACTER_NAME value of the first sentence. In case of no match, a value of false would have been reported.
The syntax of the queryJsonPath method is:
moduleVariable.queryJsonPath(result, {
    jsPath: parameterValue,
    outputType: parameterValue,
    modifierFlag: parameterValue
})
where parameterValue is one of the possible values of the corresponding parameters described below:
- moduleVariable is the variable corresponding to the module and set with require().
- result is the object containing the analysis results.
- jsPath is the JSONPath expression that determines the nodes to which the action must be applied. It can be:
  - A standard JSONPath.
  Or:
  - An array of multiple JSONPaths (see modify).
- outputType is the output type. It can be:
  - array: an array containing all the value(s) matched by the JSONPath query.
  - string: the value(s) matched by the JSONPath query are concatenated into a single string (elements are separated by a whitespace).
  - regex: the value(s) matched by the JSONPath query are turned into a regular expression.
  - object: the value(s) matched by the JSONPath query are turned into an object (each property will have a value of true).
  - boolean: boolean value of true in case of a match, false otherwise.
  - count: returns an integer representing the number of matched items.
  - paths: contains advanced query information for further script manipulations, in a custom format optimized for Studio's output model (see below).
  - standard paths: same output as the standard paths method of the official JSONPath library.
  - nodes: same output as the standard nodes method of the official JSONPath library.
  - regex object: (to be used with extractions only) an object containing several regular expressions used for sophisticated JSONPath validations involving aggregated data (see below).
- modifierFlag is a flag that:
  - If you select array, string or count as outputType and the flag is set to false, duplicates will not be removed from the matched items (which is the default behavior).
  - If you select regex as outputType and the flag is set to true, it will make the regular expression case insensitive.
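How the simpler output types relate to one another can be sketched without jsonPlug. The shapeMatches function below is a hypothetical helper that works on an already matched list of values. Its dedupe argument is an assumption that stands in for duplicate removal (the module's own default is whatever modifierFlag defines, not what this sketch does), and using the matched values as property names for the object type is also an assumption.

```javascript
// Hypothetical helper, not part of jsonPlug: reshapes a list of already
// matched values into some of the output types described above.
function shapeMatches(values, outputType, dedupe) {
    var items = values;
    var flags = {};
    if (dedupe) {
        // Keep only the first occurrence of each value.
        items = values.filter(function (value, index) {
            return values.indexOf(value) === index;
        });
    }
    switch (outputType) {
        case "array":
            return items.slice();
        case "string":
            // Elements are separated by a single whitespace.
            return items.join(" ");
        case "object":
            // Assumption: each matched value becomes a property set to true.
            items.forEach(function (value) {
                flags[value] = true;
            });
            return flags;
        case "boolean":
            return items.length > 0;
        case "count":
            return items.length;
        default:
            throw new Error("Unsupported output type in this sketch: " + outputType);
    }
}

var matched = ["Peter Parker", "Miles Morales", "Peter Parker"];
console.log(shapeMatches(matched, "array", true));
console.log(shapeMatches(matched, "string", false));
console.log(shapeMatches(matched, "count", false));
```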
Alternatively, you can use this syntax:
moduleVariable.queryJsonPath(result, jsPath, outputType, modifierFlag)
Note
- The parameters in the second syntax must be declared in this exact order.
- Both syntaxes can be used interchangeably.
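To illustrate why the order matters in the second syntax, here is a minimal sketch of how a function could accept both call styles and reduce them to the same options. The normalizeQueryArgs helper is hypothetical and only mirrors the parameter names listed above; it says nothing about how jsonPlug handles its arguments internally.

```javascript
// Hypothetical helper, not part of jsonPlug: accepts either call style
// and returns the same options object.
function normalizeQueryArgs(jsPathOrOptions, outputType, modifierFlag) {
    // An array is a list of JSONPaths, so only a plain object counts as options.
    var isOptionsObject = jsPathOrOptions !== null &&
        typeof jsPathOrOptions === "object" &&
        !Array.isArray(jsPathOrOptions);
    if (isOptionsObject) {
        return {
            jsPath: jsPathOrOptions.jsPath,
            outputType: jsPathOrOptions.outputType,
            modifierFlag: jsPathOrOptions.modifierFlag
        };
    }
    // Positional style: the order is jsPath, outputType, modifierFlag.
    return {
        jsPath: jsPathOrOptions,
        outputType: outputType,
        modifierFlag: modifierFlag
    };
}

var byName = normalizeQueryArgs({ jsPath: "$.match_info", outputType: "count" });
var byPosition = normalizeQueryArgs("$.match_info", "count");
console.log(JSON.stringify(byName) === JSON.stringify(byPosition)); // true
```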
'paths' output type
If outputType is set to paths, the output array will contain as many objects as there are items matched by the JSONPath(s) in the result object.
The format of these objects will change according to the matched items:
- for a record or a field in the extraction property, the following properties will be created:
  - path_type: will contain a string with the value of extraction in this case.
  - record_id: the position (index) of the matched record in the extraction array.
  - template_name: the record template name.
  - field_id: the position (index) of the matched field inside its parent record.
  - field_name: the name of the matched field.
  - field_value: the value of the matched field.
  - field_instance: the instance array of the matched field.
  - field_instance_offset: an array of objects containing all the positional information of the matched extractions, structured as:
    - begin: the starting position of that textual instance.
    - end: the ending position of that textual instance.
    - length: the length of the textual match.
  - field_instance_text: an array containing all the non-normalized pieces of text matched by this extraction.
  - field_confidence: the confidence score of the matched field.
  - field_siblings: an array of objects containing information about the siblings of the matched field, useful to check aggregated data:
    - field_name: the name of the sibling field.
    - field_value: the value of the sibling field.
    - field_id: the position (index) of the sibling field inside its parent record.
- for a category in the categorization property, the following properties will be created:
  - path_type: will contain a string with the value of categorization in this case.
  - id: the position (index) of the matched category in the categorization array.
  - name: the name of the matched category.
  - label: the label of the matched category.
  - score: the score of the matched category.
  - compound: the compound score of the matched category.
  - frequency: the frequency of the matched category.
  - winner: the winner status of the matched category.
  - rules: the rules array of the matched category.
- for a segment in the segment property, the following properties will be created:
  - path_type: will contain a string with the value of segment in this case.
  - index: the index of the matched segment within the segment array.
  - name: the name of the matched segment.
  - positions: an array of objects containing all the positional information of the matched segment, structured as:
    - begin: the starting position of that segment instance.
    - end: the ending position of that segment instance.
    - score: the score of that segment instance.
    - rules: the rules information of that segment instance.
- for a section in the sections property, the following properties will be created:
  - path_type: will contain a string with the value of sections in this case.
  - index: the index of the matched section within the sections array.
  - name: the name of the matched section.
  - positions: an array of objects containing all the positional information of the matched section, structured as:
    - begin: the starting position of that section instance.
    - end: the ending position of that section instance.
With this output, you can use the JSONPath query language to select specific items and loop over only those, which covers more advanced use cases that the standard jsonPlug method actions cannot solve.
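As an illustration of looping over this output, the sketch below works on a hand-written array that follows the extraction shape described above. The sample values are hypothetical and only a subset of the properties is included; in a real script the array would come from queryJsonPath with outputType set to paths.

```javascript
// Hand-written sample in the shape described above (subset of properties).
var paths = [
    {
        path_type: "extraction",
        record_id: 0,
        template_name: "CHARACTERS",
        field_id: 0,
        field_name: "CHARACTER_NICKNAME",
        field_value: "Spider-Man",
        field_siblings: [
            { field_name: "CHARACTER_NAME", field_value: "Peter Parker", field_id: 1 }
        ]
    },
    {
        path_type: "extraction",
        record_id: 1,
        template_name: "CHARACTERS",
        field_id: 0,
        field_name: "CHARACTER_NICKNAME",
        field_value: "Peter Parker",
        field_siblings: [
            { field_name: "CHARACTER_NAME", field_value: "Miles Morales", field_id: 1 }
        ]
    }
];

// Collect every CHARACTER_NAME value found among the siblings.
var names = [];
paths.forEach(function (item) {
    item.field_siblings.forEach(function (sibling) {
        if (sibling.field_name === "CHARACTER_NAME") {
            names.push(sibling.field_value);
        }
    });
});

// Keep the matched fields whose value is also the name of some character.
var conflicts = paths.filter(function (item) {
    return item.path_type === "extraction" && names.indexOf(item.field_value) !== -1;
});

conflicts.forEach(function (item) {
    console.log("record " + item.record_id + ", field " + item.field_id + ": " + item.field_value);
});
```

The record_id and field_id values of each conflicting item can then be used to address exactly those fields in a later step.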
Warning
- This output shows a snapshot of the result object at the time the queryJsonPath method is invoked and will not reflect any changes made after the output is generated.
'regex object' output type
If outputType is set to regex object, the method will return a single object that includes seven properties:
- path_type: this property will always have a value of extraction since currently only extractions are supported by this output.
- record_ids: a regex that matches all the indexes of the matched records.
- templates: a regex that matches all the templates values matched by the JSONPath query.
- field_names: a regex that matches all the field names matched by the JSONPath query.
- field_values: a regex that matches all the extracted values matched by the JSONPath query.
- field_instance_offsets: a regex that matches all the extraction offsets matched by the JSONPath query.
- field_instance_text: a regex that matches all the instance texts matched by the JSONPath query.
These properties provide advanced regex capabilities that can be used to quickly validate complex aggregated data.
Warning
- This output is based on a snapshot of the result object at the time the queryJsonPath method is invoked and will not reflect any changes made after the output is generated.
- To use the record_ids regex, you must set the addIndexToRecords option to true.
Here's an example:
function onFinalize(result) {
    var false_covers = jsonPlug.queryJsonPath(result, {
        jsPath: "$.match_info.rules.extraction[?(@.template == 'Covers')].fields[?(@.value == 'false')]",
        outputType: "regex object"
    });
    var false_covers_ids = false_covers.record_ids;
    var false_covers_names = false_covers.field_names;
    jsonPlug.jsonPlug(result, {
        action: "delete",
        jsPathConditionFlag: true,
        jsPath: "$.match_info.rules.extraction[?(@.template == 'Covers' && " + false_covers_ids +
            ".test(@.index) )].fields[?(" + false_covers_names + ".test(@.field) && @.value == 'true' )]",
        jsPathAction: "#this#",
        recursive: true,
        addIndexToRecords: true
    });
    return result;
}
This code aims to identify records where the same cover has both a value of false and a value of true. To avoid hard-coding all the possible cover names (which can be hundreds), the code performs the following steps:
- Matches all the records named Covers having a child field with an extracted value of false, and extracts the record ids to know which records present this issue.
- Calls the main jsonPlug method once the querying phase is over, using the addIndexToRecords option, which adds an index property to each record.
- Uses the advanced info provided by the regex object to identify which records suffer from the issue and only enters those.
- Validates only the field names presenting the duplication issue and matches those with an extracted value of true.
- Uses the delete action to remove those fields.
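The steps above can be emulated on plain data to make the logic easier to follow. This is only a sketch: it uses arrays and indexOf instead of jsonPlug, JSONPath and regular expressions, and the sample records, the cover names (FIRE, THEFT) and the removeContradictedTrueFields function are all hypothetical.

```javascript
// Hand-written sample records; cover names and values are illustrative only.
var records = [
    {
        template: "Covers",
        fields: [
            { field: "FIRE", value: "false" },
            { field: "FIRE", value: "true" },
            { field: "THEFT", value: "true" }
        ]
    },
    {
        template: "Covers",
        fields: [
            { field: "THEFT", value: "true" }
        ]
    }
];

function removeContradictedTrueFields(records) {
    var recordIds = [];
    var fieldNames = [];
    // Step 1: find the Covers records and the field names with a value of "false".
    records.forEach(function (record, index) {
        if (record.template !== "Covers") {
            return;
        }
        record.fields.forEach(function (f) {
            if (f.value === "false") {
                if (recordIds.indexOf(index) === -1) { recordIds.push(index); }
                if (fieldNames.indexOf(f.field) === -1) { fieldNames.push(f.field); }
            }
        });
    });
    // Remaining steps: enter only the flagged records and drop the "true"
    // fields whose name was also extracted with a value of "false".
    records.forEach(function (record, index) {
        if (recordIds.indexOf(index) === -1) {
            return;
        }
        record.fields = record.fields.filter(function (f) {
            return !(f.value === "true" && fieldNames.indexOf(f.field) !== -1);
        });
    });
    return records;
}

removeContradictedTrueFields(records);
console.log(JSON.stringify(records[0].fields));
console.log(JSON.stringify(records[1].fields));
```

In this sample only the FIRE field with a value of true is removed from the first record; the second record is never entered because it contains no field with a value of false.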