tagHierarchy
Overview
The tagHierarchy module allows you to create a hierarchy between multiple tags on a token and alter the tagging in the Semantic Analysis tool window and in the final output. It must be used in the onTagger function, because that function manages the tagging code executed after the evaluation of tagging rules.
When you install the tagHierarchy module in your project, Studio modifies the main.jr file to insert this statement at the beginning of the file:
var tagHierarchy = require('modules/tagHierarchy');
This module has one method called tagHierarchy.
Consider this example:
TEMPLATE(INJURY)
{
    @TYPE_OF_INJURY,
    @LOCATION_OF_INJURY
}
TAGS
{
    @INJURY_TYPE,
    @INJURY_LOCATION
}
...
SCOPE SENTENCE
{
    TAGGER()
    {
        @INJURY_TYPE[SYNCON(105781808)] // backache
    }
    TAGGER()
    {
        @INJURY_LOCATION[SYNCON(105781808)] // backache
    }
    IDENTIFY(INJURY)
    {
        @TYPE_OF_INJURY[TAG(INJURY_LOCATION)]|[TAG]
    }
}
If the rule is applied to this text:
The patient suffers from backache.
you will get the following information from the disambiguator:
Name | Value |
---|---|
Text | backache |
Base Form | backache |
Type | NOU |
Gender | Neutral |
Number | Singular |
Syncon | #105781808 |
TAG (10000) | INJURY_LOCATION |
TagEntry (10000) | backache |
TAG (10000) | INJURY_TYPE |
TagEntry (10000) | backache |
and this record as output:
Template: INJURY
Field | Value |
---|---|
@TYPE_OF_INJURY | INJURY_LOCATION |
With this code:
function onTagger() {
    tagHierarchy.tagHierarchy("string", [["INJURY_LOCATION", "INJURY_TYPE"]]);
}
Or:
function onTagger() {
    tagHierarchy.tagHierarchy("string", [["ALWAYS!", "INJURY_TYPE"]]);
}
Or:
function onTagger() {
    tagHierarchy.tagHierarchy("string", "list1.cl");
}
you will get the following information from the disambiguator:
Name | Value |
---|---|
Text | backache |
Base Form | backache |
Type | NOU |
Gender | Neutral |
Number | Singular |
Syncon | #105781808 |
TAG (10000) | INJURY_LOCATION |
TagEntry (10000) | backache |
UNTAG(10000) | INJURY_TYPE |
TagEntry (10000) | backache |
and the same record as above.
In the disambiguator information, a new highlighted line called UNTAG appears, containing the tag to remove. The corresponding TagEntry is also highlighted and removed.
The syntax of this method is:
moduleVariable.tagHierarchy(tagDataType, hierarchicalRelationships)
Or:
moduleVariable.tagHierarchy(tagDataType, listFilePath)
where:
- moduleVariable is the variable corresponding to the module and set with require().
- tagDataType allows you to define the tags to match as either a string or a regex (see the examples below).
- hierarchicalRelationships is an array of arrays. Each inner array must have two elements:
    - The strong prevailing tag.
    - The weak tag.

  Instead of the strong tag, you can also use the ALWAYS! value. It is a special value that always removes the weak tag from the output and the disambiguation, letting the other token tag(s) prevail.
- listFilePath is the path of the list file in which you define your hierarchies.
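The effect of hierarchicalRelationships on a single token can be pictured with a small, self-contained model. This is only a sketch of the behavior described above, not the module's actual implementation: the function name applyHierarchy and the kept/untagged result shape are hypothetical, and tags are compared as plain strings.

```javascript
// Hypothetical model of the pruning described above; NOT the module's real code.
// ALWAYS is the special value documented for the strong-tag position.
const ALWAYS = "ALWAYS!";

// tokenTags: tag names found on one token.
// relationships: array of [strongTag, weakTag] pairs, compared as plain strings.
function applyHierarchy(tokenTags, relationships) {
  const removed = new Set();
  for (const [strong, weak] of relationships) {
    // The weak tag is dropped when the strong tag is on the same token,
    // or unconditionally when the strong position holds ALWAYS!.
    const strongPresent = strong === ALWAYS || tokenTags.includes(strong);
    if (strongPresent && tokenTags.includes(weak)) {
      removed.add(weak);
    }
  }
  return {
    kept: tokenTags.filter((tag) => !removed.has(tag)),
    untagged: tokenTags.filter((tag) => removed.has(tag)),
  };
}

const result = applyHierarchy(
  ["INJURY_LOCATION", "INJURY_TYPE"],
  [["INJURY_LOCATION", "INJURY_TYPE"]]
);
console.log(result.kept.join(", "));     // INJURY_LOCATION
console.log(result.untagged.join(", ")); // INJURY_TYPE
```

In this model, a pair whose strong tag is absent from the token leaves the weak tag untouched, while an ALWAYS! pair removes the weak tag whatever else is on the token.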
List file syntaxes
You can write list files with four different syntaxes:
strongTag_1, weakTag_1
strongTag_2, weakTag_2
...
strongTag_n, weakTag_n
Or:
strongTag_1 REMOVES weakTag_1
strongTag_2 REMOVES weakTag_2
...
strongTag_n REMOVES weakTag_n
Or you can use regular expressions, like this:
/^strongTag_1$/i REMOVES /^weakTag_1$/i
/^strongTag_2$/i REMOVES /^weakTag_2$/i
...
/^strongTag_n$/i REMOVES /^weakTag_n$/i
Or:
/^strongTag_1$/i, /^weakTag_1$/i
/^strongTag_2$/i, /^weakTag_2$/i
...
/^strongTag_n$/i, /^weakTag_n$/i
where REMOVES is the operator that removes the weak tag. You can use it interchangeably with the comma (,), as long as only one of them is used per line.
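To make the four line shapes concrete, here is a minimal sketch of how one list-file line could be split into a strong/weak pair. It is an illustration only, not the module's parser: parseHierarchyLine and parseTag are hypothetical names, and the sketch assumes that tag names and patterns contain neither commas nor whitespace.

```javascript
// Hypothetical parser for one list-file line; a sketch of the documented syntaxes only.
function parseHierarchyLine(line) {
  const trimmed = line.trim();
  const usesRemoves = /\sREMOVES\s/.test(trimmed);
  const usesComma = trimmed.includes(",");
  // Exactly one of the two separators may appear on a line.
  if (usesRemoves === usesComma) {
    throw new Error("Use exactly one of REMOVES or a comma per line: " + line);
  }
  const parts = trimmed.split(usesRemoves ? /\s+REMOVES\s+/ : /\s*,\s*/);
  if (parts.length !== 2) {
    throw new Error("Expected one strong tag and one weak tag in: " + line);
  }
  const [strong, weak] = parts.map(parseTag);
  return { strong, weak };
}

// A tag written as /pattern/flags becomes a RegExp; anything else stays a string.
function parseTag(text) {
  const match = /^\/(.+)\/([a-z]*)$/.exec(text);
  return match ? new RegExp(match[1], match[2]) : text;
}

const pair = parseHierarchyLine("strongTag_1 REMOVES weakTag_1");
console.log(pair.strong + " > " + pair.weak); // strongTag_1 > weakTag_1
```

Under these assumptions, a line that contains both separators, or neither, is rejected instead of being guessed at.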
Note
You can use these list file syntaxes interchangeably.
Warning
For performance reasons, define tags with strings rather than regular expressions whenever possible. Matching tags by string is faster than matching them by regular expression, which makes your project more efficient and responsive, especially when you analyze long documents.
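The difference between the two matching modes can be sketched as follows. This is an assumption-laden illustration, not the module's internals: makeTagMatcher is a hypothetical helper, and the way the real module receives regex patterns may differ.

```javascript
// Hypothetical matchers illustrating the two tagDataType modes; not the module's code.
function makeTagMatcher(tagDataType, pattern) {
  if (tagDataType === "string") {
    // Exact comparison: a single equality check per tag name.
    return (tagName) => tagName === pattern;
  }
  if (tagDataType === "regex") {
    if (!(pattern instanceof RegExp)) {
      throw new Error("This sketch expects a RegExp object in regex mode");
    }
    // Pattern matching: the regex engine runs on every tag name.
    return (tagName) => pattern.test(tagName);
  }
  throw new Error("Unknown tagDataType: " + tagDataType);
}

const byString = makeTagMatcher("string", "INJURY_TYPE");
const byRegex = makeTagMatcher("regex", /^INJURY_TYPE$/i);
console.log(byString("INJURY_TYPE")); // true
console.log(byRegex("injury_type"));  // true
```

Both matchers accept the tag in this example. The string version does so with a plain comparison, which is why it is the cheaper choice when no pattern matching is needed.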