Manage tagging

Introduction

The pre-defined DIS object provides the following methods to manage tagging:

To add tags:
- tagToken
- tagPhrase
- tagSentence
- tagRange

To untag:
- untagToken
- untagPhrase
- untagSentence

To rename tags:
- renameTag

To add tags with values corresponding to their TagEntry:
- tagTokenWithValue
- tagPhraseWithValue
- tagSentenceWithValue
- tagRangeWithValue

To get the token TagEntry:
- getTokenTagEntry

Tag and untag methods

Tag methods add a tag to all the tokens of a specified text subdivision, and untag methods remove it from them.

The untag methods don't simply delete tags: they turn them into negative tags, or "untags". Negative tags no longer have the effect of tags, but they remain stored in the text analysis results.

If the definition of a tag includes a syncon ID, untagging methods will also restore the tokens' original syncon ID.

Info

Untagging information is highlighted in the detail panel of the Semantic Analysis tool window of the Studio IDE.

tagSentence and untagSentence

Use tagSentence and untagSentence to tag and untag entire sentences.

With this statement:

DIS.tagSentence(1, "MYTAG");

applied to this text:

This is a sentence. This is another sentence.

a MYTAG tag spanning all the tokens of the second sentence, the one with zero-based index 1, will be added to the analysis results. The value of the tag is the text of the tagged sentence.
This code instead:

DIS.untagSentence(1, "MYTAG");

will transform the above tag into a negative tag, or "untag".

tagSentenceWithValue

Use tagSentenceWithValue to tag entire sentences and assign the tags a custom value.

With this statement:

DIS.tagSentenceWithValue(1, "CUBE_INVENTOR", "Erno Rubik");

applied to this text:

The example is in the next sentence. A Hungarian architect invented the most famous cube in the world.

a CUBE_INVENTOR tag spanning all the tokens of the second sentence, the one with zero-based index 1, will be added to the analysis results. The value of the tag is Erno Rubik.

tagPhrase and untagPhrase

Use tagPhrase and untagPhrase to tag and untag phrases.

With this statement:

DIS.tagPhrase(2, "BEER");

applied to this text:

I drank a Guinness beer.

a BEER tag spanning all the tokens of the third phrase (a Guinness beer), the one with zero-based index 2, will be added to the analysis results. The value of the tag is the text of the tagged phrase.
This code instead:

DIS.untagPhrase(2, "BEER");

will transform the above tag into a negative tag.

tagPhraseWithValue

Use tagPhraseWithValue to tag phrases and assign the tags a custom value.

With this statement:

DIS.tagPhraseWithValue(2, "BEER", "Irish beers");

applied to this text:

I drank a Guinness beer.

a BEER tag spanning all the tokens of the third phrase (a Guinness beer), the one with zero-based index 2, will be added to the analysis results. The value of the tag is Irish beers.

tagToken

Use tagToken to add a tag to a token.

If you apply this code:

DIS.tagToken(3, "COLOR");

to this input text:

The sky is blue.

tag COLOR will be added to the fourth token (blue), that is the token with zero-based index 3. The value of the tag will be the text of the tagged token.

tagTokenWithValue

Use tagTokenWithValue to add a tag to a token and assign the tag a custom value.

If you apply this code:

DIS.tagTokenWithValue(3, "COLOR", "sky color");

to this input text:

The sky is blue.

tag COLOR will be added to the fourth token (blue), that is the token with zero-based index 3. The value of the tag will be sky color.

untagToken

Use untagToken to transform a token's tag into a negative tag.

For example, given this definition:

TAGS
{
    @MYTAG
}

this tagging rule:

SCOPE SENTENCE
{
    TAGGER()
    {
        @MYTAG[KEYWORD("had a good time")]
    }
}

applied to this text:

I went to Miami and I had a good time.

will add a MYTAG tag with the default level (10000) spanning the tokens from had to time. The value of the tag will be the text matched by the tagging rule (had a good time).
Then, this JavaScript statement:

DIS.untagToken(7, "MYTAG");

will make the above tag negative.
It is sufficient to apply untagToken to one token to "flip" a tag spanning multiple tokens.
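
The flip behavior can be pictured with a toy model. The sketch below is not the DIS object and makes no claim about how tags are stored internally: createToyTagStore and its methods are hypothetical names that only mimic what this section describes, namely that a tag spans a range of tokens and that untagging through any covered token marks the whole tag as negative while keeping it stored.

```javascript
// Toy model only: this is NOT the real DIS object, just a stand-in
// that mimics the behavior described in this section.
function createToyTagStore() {
    const tags = []; // each entry: { label, start, end, negative }
    return {
        // Add a tag covering the tokens from start to end (inclusive).
        tagRange(start, end, label) {
            tags.push({ label, start, end, negative: false });
        },
        // Flip every matching tag that covers the token instead of deleting it.
        untagToken(tokenIndex, label) {
            for (const tag of tags) {
                const covers = tag.start <= tokenIndex && tokenIndex <= tag.end;
                if (tag.label === label && covers) {
                    tag.negative = true;
                }
            }
        },
        // Tags that still have the effect of tags.
        activeTags() {
            return tags.filter((tag) => !tag.negative);
        },
        // Everything kept in the results, negative tags included.
        storedTags() {
            return tags.slice();
        }
    };
}

const store = createToyTagStore();
store.tagRange(6, 9, "MYTAG"); // tokens "had a good time"
store.untagToken(7, "MYTAG");  // untag through the single token "a"
```

After the two calls, the toy store has no active MYTAG tag, but the negative tag is still listed among the stored ones.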

tagRange

Use tagRange to add a tag to a range of consecutive tokens.

For example, when this statement:

DIS.tagRange(3, 4, "DOG");

is applied to this text:

Rex is a German Shepherd.

a DOG tag spanning German and Shepherd, which have zero-based indexes 3 and 4 respectively, will be added to the analysis results. The value of the tag will be the text of the two tokens, including the space between them.

tagRangeWithValue

Use tagRangeWithValue to add a tag to a range of consecutive tokens and assign the tag a custom value.

For example, when this statement:

DIS.tagRangeWithValue(3, 4, "DOG", "Police dogs");

is applied to this text:

Rex is a German Shepherd.

a DOG tag spanning German and Shepherd, which have zero-based indexes 3 and 4 respectively, will be added to the analysis results. The value of the tag will be Police dogs.

Rename tag

Use renameTag to replace a tag with another tag on the same level in a single operation.
Consider, for example, this definition:

TAGS
{
    @FIRST,
    @SECOND
}

If this tagging rule:

SCOPE SENTENCE 
{
    TAGGER()
    {
        @FIRST[LEMMA("good")]
    }
}

is applied to this text:

Mark is a good boy.

tag FIRST, covering token good, is added to the analysis results. Then, this JavaScript statement:

DIS.renameTag("FIRST", "SECOND", 3);

untags the FIRST tag and adds tag SECOND to the fourth token, that is the one with zero-based index 3. Since tag FIRST had level 10000, which is the default for rule-generated tags, tag SECOND inherits that same level.

getTokenTagEntry

getTokenTagEntry returns an array of objects containing all the tags associated with a token and their corresponding tag entries. The method must be used in the onFinalize function.

For example, given this code:

function onTagger() {
    DIS.tagTokenWithValue(1, "MY_TAG1", "Christmas");
    DIS.tagTokenWithValue(1, "MY_TAG2", "December 25th");
}

applied to this text:

On Xmas, I will buy a new computer.

two tags, each with its own tag entry, are assigned to the token Xmas, that is the token with zero-based index 1.

With this code:

DIS.getTokenTagEntry(1);

the method returns this:

[
    {
        "tag": "MY_TAG1",
        "entry": "Christmas"
    },
    {
        "tag": "MY_TAG2",
        "entry": "December 25th"
    }
]
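
Because the return value is a plain array, it can be reshaped with ordinary JavaScript. The sketch below turns it into a lookup object keyed by tag label. The array literal stands in for the result of DIS.getTokenTagEntry(1) shown above, and entriesByTag is a hypothetical helper name, not a method of the DIS object.

```javascript
// Stand-in for the array returned by DIS.getTokenTagEntry(1) in the example above.
const tokenTags = [
    { tag: "MY_TAG1", entry: "Christmas" },
    { tag: "MY_TAG2", entry: "December 25th" }
];

// Hypothetical helper: build a { tagLabel: entry } lookup from the array.
function entriesByTag(tagEntries) {
    const lookup = {};
    for (const item of tagEntries) {
        lookup[item.tag] = item.entry;
    }
    return lookup;
}

const lookup = entriesByTag(tokenTags);
// lookup.MY_TAG1 is "Christmas", lookup.MY_TAG2 is "December 25th"
```

In a real script, the array would come from a call made inside onFinalize rather than from a literal.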

Syntax

Except for tagRange and renameTag, all tagging and untagging methods without a custom value have this syntax:

DIS.method(subdivisionIndex, tagLabel)

where:

method is the tagging or untagging method.
subdivisionIndex is the zero-based index of the text subdivision to tag or untag.
tagLabel is the tag label.

All tagging methods with a custom value, except for tagRangeWithValue, have the same syntax as above plus an additional parameter:

DIS.method(subdivisionIndex, tagLabel, tagValue)

where tagValue is the tag value.

The syntax of tagRange is:

DIS.tagRange(token1Index, token2Index, tagLabel)

where:

token1Index is the token index of the first token in the range to tag.
token2Index is the token index of the last token in the range to tag.
tagLabel is the tag label.

The syntax of tagRangeWithValue is the same as tagRange plus an additional parameter:

DIS.tagRangeWithValue(token1Index, token2Index, tagLabel, tagValue)

where tagValue is the tag value.

The syntax of renameTag is:

DIS.renameTag(oldTagLabel, newTagLabel, subdivisionIndex)

where:

oldTagLabel is the label of the existing tag.
newTagLabel is the label of the new tag.
subdivisionIndex is the index of the text subdivision to tag.

The syntax of getTokenTagEntry is:

DIS.getTokenTagEntry(tokenIndex)

where tokenIndex is the index of the token to get tags and tag entries from.
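
As the signatures above show, each WithValue variant differs from its base method only by the trailing tagValue argument. A small wrapper can therefore choose between the two depending on whether a value is supplied. This is a sketch, not part of the DIS object: tagTokenOptionalValue is a hypothetical helper, and the recording stub exists only so the snippet can run outside Studio. In a script, the pre-defined DIS object would be passed instead of the stub.

```javascript
// Hypothetical helper: calls tagTokenWithValue when a value is given,
// tagToken otherwise. `dis` is expected to expose both methods.
function tagTokenOptionalValue(dis, tokenIndex, tagLabel, tagValue) {
    if (tagValue === undefined) {
        dis.tagToken(tokenIndex, tagLabel);
    } else {
        dis.tagTokenWithValue(tokenIndex, tagLabel, tagValue);
    }
}

// Recording stub, used only to run the sketch outside Studio.
const calls = [];
const stubDis = {
    tagToken: (...args) => calls.push(["tagToken", ...args]),
    tagTokenWithValue: (...args) => calls.push(["tagTokenWithValue", ...args])
};

tagTokenOptionalValue(stubDis, 3, "COLOR");
tagTokenOptionalValue(stubDis, 3, "COLOR", "sky color");
// calls now holds one tagToken call followed by one tagTokenWithValue call
```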