tagHierarchy
Overview
The tagHierarchy module allows you to create a hierarchy between multiple tags on a token and alter the tagging in the Semantic Analysis tool window and in the final output. It must be used in the onTagger function.
When you install the tagHierarchy module in your project, Studio modifies the main.jr file to insert this statement at the beginning of the file:
var tagHierarchy = require('modules/tagHierarchy');
This module has one method called tagHierarchy.
Consider this example:
TEMPLATE(INJURY)
{
    @TYPE_OF_INJURY,
    @LOCATION_OF_INJURY
}
TAGS
{
    @INJURY_TYPE,
    @INJURY_LOCATION
}
...
SCOPE SENTENCE
{
    TAGGER()
    {
        @INJURY_TYPE[SYNCON(105781808)] // backache
    }
    TAGGER()
    {
        @INJURY_LOCATION[SYNCON(105781808)] // backache
    }
    IDENTIFY(INJURY)
    {
        @TYPE_OF_INJURY[TAG(INJURY_LOCATION)]|[TAG]
    }
}
If the rule is applied to this text:
The patient suffers from backache
you will get:
Disambiguation | Output |
---|---|
With this code:
function onTagger() {
    tagHierarchy.tagHierarchy("string", [["INJURY_LOCATION", "INJURY_TYPE"]]);
}
Or:
function onTagger() {
    tagHierarchy.tagHierarchy("string", [["ALWAYS!", "INJURY_TYPE"]]);
}
Or:
function onTagger() {
    tagHierarchy.tagHierarchy("string", "list1.cl");
}
you will get:
Disambiguation | Output |
---|---|
As you can see in the disambiguation panel, the weak tag INJURY_TYPE is highlighted and a new line called UNTAG, which contains the tag to remove, appears. The final output changes as well: the strong tag is the one that prevails.
The syntax of this method is:
moduleVariable.tagHierarchy(tagDataType, hierarchicalRelationships)
Or:
moduleVariable.tagHierarchy(tagDataType, listFilePath)
where:
- moduleVariable is the variable corresponding to the module and set with require().
- tagDataType is the tag data type.
- hierarchicalRelationships is an array of arrays. Each inner array must have two elements:
    - The strong prevailing tag.
    - The weak tag.
  Instead of the strong tag, you can also use the ALWAYS! value. It is a special value that always removes the weak tag from the output and the disambiguation, letting the other token tag(s) prevail.
- listFilePath is the path of the list file in which you define your hierarchies.
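The effect of the hierarchicalRelationships parameter on a single token can be sketched in plain JavaScript. This is only an illustration of the behavior described above, not the module's implementation: resolveTokenTags is a hypothetical helper, and the sketch assumes that a weak tag is removed only when its strong tag is present on the same token (or when ALWAYS! is used).

```javascript
// Illustration only: mimics the documented effect of tagHierarchy on one token.
// resolveTokenTags is a hypothetical helper, not part of the tagHierarchy module.
function resolveTokenTags(tokenTags, hierarchicalRelationships) {
  const untagged = new Set();
  for (const [strongTag, weakTag] of hierarchicalRelationships) {
    const weakPresent = tokenTags.includes(weakTag);
    // Assumption: ALWAYS! removes the weak tag whatever else is on the token.
    const strongPresent = strongTag === "ALWAYS!" || tokenTags.includes(strongTag);
    if (weakPresent && strongPresent) {
      untagged.add(weakTag);
    }
  }
  return {
    kept: tokenTags.filter((tag) => !untagged.has(tag)),
    untagged: [...untagged],
  };
}

// "backache" carries both tags in the example above.
const backacheTags = ["INJURY_TYPE", "INJURY_LOCATION"];
const withStrongTag = resolveTokenTags(backacheTags, [["INJURY_LOCATION", "INJURY_TYPE"]]);
const withAlways = resolveTokenTags(backacheTags, [["ALWAYS!", "INJURY_TYPE"]]);
console.log(withStrongTag, withAlways);
```

In both calls the sketch keeps INJURY_LOCATION and reports INJURY_TYPE as the removed tag, which matches the UNTAG line described for the example.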
List file syntaxes
You can write list files with four different syntaxes:
strongTag_1, weakTag_1
strongTag_2, weakTag_2
...
strongTag_n, weakTag_n
Or:
strongTag_1 REMOVES weakTag_1
strongTag_2 REMOVES weakTag_2
...
strongTag_n REMOVES weakTag_n
Or you can use regular expressions, like this:
/^strongTag_1$/i REMOVES /^weakTag_1$/i
/^strongTag_2$/i REMOVES /^weakTag_2$/i
...
/^strongTag_n$/i REMOVES /^weakTag_n$/i
Or:
/^strongTag_1$/i, /^weakTag_1$/i
/^strongTag_2$/i, /^weakTag_2$/i
...
/^strongTag_n$/i, /^weakTag_n$/i
where REMOVES is the operator that removes the weak tag. You can use it interchangeably with the comma (,), as long as only one of them is used per line.
Note
You can use these list file syntaxes interchangeably.
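To make the four syntaxes concrete, here is a minimal sketch of how such lines could be read into strong/weak pairs. It is not the module's parser: parseHierarchyList and toMatcher are hypothetical helpers, and the sketch assumes that the different syntaxes may appear in the same file and that a field written as /pattern/flags is a regular expression.

```javascript
// Illustration only: a simplified reader for the list file syntaxes above.
// parseHierarchyList and toMatcher are hypothetical helpers, not module functions.
function parseHierarchyList(text) {
  const pairs = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;
    // One separator per line: either the REMOVES operator or a comma.
    const parts = line.includes(" REMOVES ")
      ? line.split(" REMOVES ")
      : line.split(",");
    if (parts.length !== 2) {
      throw new Error("Expected exactly one separator in line: " + line);
    }
    pairs.push(parts.map(toMatcher));
  }
  return pairs;
}

// Turns "/pattern/flags" into a RegExp and leaves plain tag names as strings.
function toMatcher(field) {
  const value = field.trim();
  const regexLiteral = value.match(/^\/(.+)\/([a-z]*)$/);
  return regexLiteral ? new RegExp(regexLiteral[1], regexLiteral[2]) : value;
}

const sampleList = [
  "INJURY_LOCATION, INJURY_TYPE",
  "INJURY_LOCATION REMOVES INJURY_TYPE",
  "/^injury_location$/i REMOVES /^injury_type$/i",
  "/^injury_location$/i, /^injury_type$/i",
].join("\n");
const parsedPairs = parseHierarchyList(sampleList);
console.log(parsedPairs);
```

The sketch is deliberately naive: a regular expression that itself contains a comma would be split incorrectly on comma-separated lines, so it should be read as an aid to understanding the syntax rather than as a reference parser.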