tagHierarchy

Overview

The tagHierarchy module lets you define a hierarchy among multiple tags on the same token and change the tagging shown in the Semantic Analysis tool window and in the final output. It must be used inside the onTagger function, because that function runs tagging code after the tagging rules have been evaluated.

When you install the tagHierarchy module in your project, Studio modifies the main.jr file to insert this statement at the beginning of the file:

var tagHierarchy = require('modules/tagHierarchy');

The module has a single method, also called tagHierarchy.

Consider this example:

TEMPLATE(INJURY)
{
   @TYPE_OF_INJURY,
   @LOCATION_OF_INJURY
}

TAGS
{   
    @INJURY_TYPE,
    @INJURY_LOCATION
}
...

SCOPE SENTENCE
{
    TAGGER()
    {
        @INJURY_TYPE[SYNCON(105781808)] // backache
    }

    TAGGER()
    {
        @INJURY_LOCATION[SYNCON(105781808)] // backache
    }

    IDENTIFY(INJURY)
    {
        @TYPE_OF_INJURY[TAG(INJURY_LOCATION)]|[TAG]
    }
}

If these rules are applied to this text:

The patient suffers from backache.

you will get the following information from the disambiguator:

Name               Value
Text               backache
Base Form          backache
Type               NOU
Gender             Neutral
Number             Singular
Syncon             #105781808
TAG (10000)        INJURY_LOCATION
TagEntry (10000)   backache
TAG (10000)        INJURY_TYPE
TagEntry (10000)   backache

and this record as output:

Template: INJURY

Field             Value
@TYPE_OF_INJURY   INJURY_LOCATION

With this code:

function onTagger() {
    tagHierarchy.tagHierarchy("string", [["INJURY_LOCATION", "INJURY_TYPE"]])
}

Or:

function onTagger() {
    tagHierarchy.tagHierarchy("string", [["ALWAYS!", "INJURY_TYPE"]])
}

Or:

function onTagger() {
    tagHierarchy.tagHierarchy("string", "list1.cl")
}

you will get the following information from the disambiguator:

Name               Value
Text               backache
Base Form          backache
Type               NOU
Gender             Neutral
Number             Singular
Syncon             #105781808
TAG (10000)        INJURY_LOCATION
TagEntry (10000)   backache
UNTAG(10000)       INJURY_TYPE
TagEntry (10000)   backache

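and the same record as above, because the IDENTIFY rule relies on INJURY_LOCATION, which is still present.

In the disambiguator output, a new highlighted line called UNTAG appears, containing the tag to remove. The corresponding TagEntry is also highlighted and removed.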

The syntax of this method is:

moduleVariable.tagHierarchy(tagDataType, hierarchicalRelationships)

Or:

moduleVariable.tagHierarchy(tagDataType, listFilePath)

where:

  • moduleVariable is the variable corresponding to the module and set with require().
  • tagDataType allows you to define the tags to match as either a string or a regex (see the examples below).
  • hierarchicalRelationships is an array of arrays. Each inner array must have two elements:

    • The strong prevailing tag.
    • The weak tag.

    Instead of the strong tag, you can also use the ALWAYS! value. It is a special value that always removes the weak tag from the output and the disambiguation, letting the other tag(s) of the token prevail.

  • listFilePath is the path of the list file in which you define your hierarchies.
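
To make the strong/weak semantics concrete, the following is a minimal sketch in plain JavaScript that models how a hierarchicalRelationships array might be resolved against the tags of a single token. It is not the module's implementation: applyHierarchy is a hypothetical helper, and it assumes that every pair is evaluated against the token's original tag set and that a strong tag only acts when it is on the same token as the weak tag.

```javascript
// Illustrative model only: NOT the tagHierarchy module's actual code.
// applyHierarchy is a hypothetical helper name introduced for this sketch.
function applyHierarchy(tokenTags, relationships) {
    var removed = [];
    relationships.forEach(function (pair) {
        var strong = pair[0];
        var weak = pair[1];
        // Assumption: ALWAYS! removes the weak tag unconditionally;
        // otherwise the strong tag must be present on the same token.
        var strongApplies = strong === "ALWAYS!" || tokenTags.indexOf(strong) !== -1;
        var weakPresent = tokenTags.indexOf(weak) !== -1;
        if (strongApplies && weakPresent && removed.indexOf(weak) === -1) {
            removed.push(weak);
        }
    });
    // Keep every tag that was not marked for removal.
    var kept = tokenTags.filter(function (tag) {
        return removed.indexOf(tag) === -1;
    });
    return { kept: kept, removed: removed };
}

// The "backache" token from the example carries both tags.
var result = applyHierarchy(
    ["INJURY_LOCATION", "INJURY_TYPE"],
    [["INJURY_LOCATION", "INJURY_TYPE"]]
);
console.log(result.kept, result.removed);
```

In this model, the pair ["ALWAYS!", "INJURY_TYPE"] gives the same result for the example token, which matches the identical disambiguator output shown for the first two onTagger variants.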

List file syntaxes

You can write list files with four different syntaxes:

strongTag_1, weakTag_1
strongTag_2, weakTag_2
...
strongTag_n, weakTag_n

Or:

strongTag_1 REMOVES weakTag_1
strongTag_2 REMOVES weakTag_2
...
strongTag_n REMOVES weakTag_n

Or, using regular expressions:

/^strongTag_1$/i REMOVES /^weakTag_1$/i
/^strongTag_2$/i REMOVES /^weakTag_2$/i
...
/^strongTag_n$/i REMOVES /^weakTag_n$/i

Or:

/^strongTag_1$/i, /^weakTag_1$/i
/^strongTag_2$/i, /^weakTag_2$/i
...
/^strongTag_n$/i, /^weakTag_n$/i

where REMOVES is the operator that removes the weak tag. You can use it interchangeably with the comma (,), as long as only one of them is used per line.

Note

You can use these list file syntaxes interchangeably.
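
As an illustration of how the four syntaxes map onto the same strong/weak pairs, here is a small sketch that parses one list-file line at a time. The helper names parseHierarchyLine and toMatcher are hypothetical and not part of the module. The sketch assumes one relationship per line and does not handle regular expressions that themselves contain a comma.

```javascript
// Hypothetical helpers for illustration; the module reads list files itself.

// Turns "/pattern/flags" into a RegExp and leaves plain tag names as strings.
function toMatcher(text) {
    var m = /^\/(.*)\/([a-z]*)$/.exec(text);
    return m ? new RegExp(m[1], m[2]) : text;
}

// Parses "strong, weak" or "strong REMOVES weak" into a [strong, weak] pair.
function parseHierarchyLine(line) {
    var trimmed = line.trim();
    if (trimmed === "") {
        return null; // skip blank lines
    }
    // Try the REMOVES operator first, then fall back to the comma.
    var parts = trimmed.split(/\s+REMOVES\s+/);
    if (parts.length !== 2) {
        parts = trimmed.split(/\s*,\s*/);
    }
    if (parts.length !== 2) {
        throw new Error("Expected exactly one separator in line: " + line);
    }
    return parts.map(toMatcher);
}

var plainPair = parseHierarchyLine("strongTag_1, weakTag_1");
var regexPair = parseHierarchyLine("/^strongTag_1$/i REMOVES /^weakTag_1$/i");
console.log(plainPair, regexPair);
```

Both calls produce a two-element array with the strong tag first and the weak tag second, which is the same shape as one inner array of hierarchicalRelationships.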

Warning

For performance, define tags with strings rather than regular expressions when using the module. Matching tags by string is faster than matching them by regular expression, so prefer strings to keep your project efficient and responsive, especially when analyzing long documents.