Manage tagging
Introduction
The pre-defined DIS object provides the following methods to manage tagging:
-
To add tags:
tagToken
tagPhrase
tagSentence
tagRange
-
To untag:
untagToken
untagPhrase
untagSentence
-
To rename tags:
renameTag
-
To add tags with values corresponding to their TagEntry:
tagTokenWithValue
tagPhraseWithValue
tagSentenceWithValue
tagRangeWithValue
-
To get the token TagEntry:
getTokenTagEntry
Tag and untag methods
Tag and untag methods add or remove a tag to/from all the tokens of a specified text subdivision.
The untag methods don't simply remove tags: they turn them into negative tags, or "untags". Negative tags do not have the effect of tags, but remain stored in the text analysis results.
If the definition of a tag includes a syncon ID, untagging methods will also restore the tokens' original syncon ID.
Info
Untagging information is highlighted in the detail panel of the Semantic Analysis tool window of the Studio IDE.
tagSentence and untagSentence
Use tagSentence and untagSentence to tag and untag entire sentences.
With this statement:
DIS.tagSentence(1, "MYTAG");
applied to this text:
This is a sentence. This is another sentence.
a MYTAG tag spanning all the tokens of the second sentence (the one whose zero-based index is 1) will be added to the analysis results. The value of the tag is the text of the tagged sentence.
This code instead:
DIS.untagSentence(1, "MYTAG");
will transform the above tag into a negative tag, or "untag".
tagSentenceWithValue
Use tagSentenceWithValue to tag entire sentences and assign the tags a custom value.
With this statement:
DIS.tagSentenceWithValue(1, "CUBE_INVENTOR", "Erno Rubik");
applied to this text:
The example is in the next sentence. A Hungarian architect invented the most famous cube in the world.
a CUBE_INVENTOR tag spanning all the tokens of the second sentence (the one whose zero-based index is 1) will be added to the analysis results. The value of the tag is Erno Rubik.
tagPhrase and untagPhrase
Use tagPhrase and untagPhrase to tag and untag phrases.
With this statement:
DIS.tagPhrase(2, "BEER");
applied to this text:
I drank a Guinness beer.
a BEER tag spanning all the tokens of the third phrase, a Guinness beer (the one whose zero-based index is 2), will be added to the analysis results. The value of the tag is the text of the tagged phrase.
This code instead:
DIS.untagPhrase(2, "BEER");
will transform the above tag into a negative tag.
tagPhraseWithValue
Use tagPhraseWithValue to tag phrases and assign the tags a custom value.
With this statement:
DIS.tagPhraseWithValue(2, "BEER", "Irish beers");
applied to this text:
I drank a Guinness beer.
a BEER tag spanning all the tokens of the third phrase, a Guinness beer (the one whose zero-based index is 2), will be added to the analysis results. The value of the tag is Irish beers.
tagToken
Use tagToken to add a tag to a token.
If you apply this code:
DIS.tagToken(3, "COLOR");
to this input text:
The sky is blue.
tag COLOR will be added to the fourth token (blue), that is the token with zero-based index 3. The value of the tag will be the text of the tagged token.
tagTokenWithValue
Use tagTokenWithValue to add a tag to a token and assign the tag a custom value.
If you apply this code:
DIS.tagTokenWithValue(3, "COLOR", "sky color");
to this input text:
The sky is blue.
tag COLOR will be added to the fourth token (blue), that is, the token with zero-based index 3. The value of the tag will be sky color.
untagToken
Use untagToken to transform a token's tag into a negative tag.
For example, given this definition:
TAGS
{
    @MYTAG
}
this tagging rule:
SCOPE SENTENCE
{
    TAGGER()
    {
        @MYTAG[KEYWORD("had a good time")]
    }
}
applied to this text:
I went to Miami and I had a good time.
will add a MYTAG tag with the default level (10000) spanning the tokens from had to time. The value of the tag will be the text matched by the tagging rule (had a good time).
Then, this JavaScript statement:
DIS.untagToken(7, "MYTAG");
will make the above tag negative.
It is sufficient to apply untagToken to one token to "flip" a tag spanning multiple tokens.
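The "flip" behavior can be pictured with a toy model. The sketch below is not the DIS implementation: makeTagStore, tag, untag, activeTags and allRecords are hypothetical names used only to illustrate that untagging one covered token turns the whole tag negative while the tag stays stored.

```javascript
// Toy model only: hypothetical names, not part of the DIS API.
function makeTagStore() {
  const records = []; // every tag ever added stays in this list

  return {
    // Add a positive tag spanning tokens first..last (zero-based, inclusive).
    tag(label, first, last) {
      records.push({ label, first, last, negative: false });
    },
    // Untagging any token covered by a tag flips the whole tag to negative.
    untag(label, tokenIndex) {
      for (const r of records) {
        const covers = tokenIndex >= r.first && tokenIndex <= r.last;
        if (r.label === label && covers) {
          r.negative = true;
        }
      }
    },
    // Only positive tags still have the effect of tags.
    activeTags() {
      return records.filter((r) => !r.negative).map((r) => r.label);
    },
    // Negative tags remain stored alongside positive ones.
    allRecords() {
      return records;
    },
  };
}

// "had a good time" is assumed to cover token indexes 6 to 9.
const store = makeTagStore();
store.tag("MYTAG", 6, 9);
store.untag("MYTAG", 7);
```

After the last call, the model holds one record for MYTAG marked as negative and no active tags.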
tagRange
Use tagRange to add a tag to a range of consecutive tokens.
For example, when this statement:
DIS.tagRange(3, 4, "DOG");
is applied to this text:
Rex is a German Shepherd.
a DOG tag spanning German and Shepherd (which have zero-based indexes 3 and 4, respectively) will be added to the analysis results. The value of the tag will be the text of the two tokens, including the space between them.
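To make the inclusive range concrete, the following sketch approximates the default tag value outside the engine. It assumes a hand-written token list in which the final period is its own token, and a hypothetical helper, defaultRangeValue, that joins token texts with single spaces; the real value comes from the analyzed text, not from this helper.

```javascript
// Hypothetical helper, not part of the DIS API. It approximates the default
// tag value by joining the token texts in the range with single spaces.
function defaultRangeValue(tokens, firstIndex, lastIndex) {
  // Both indexes are zero-based and inclusive, as in tagRange.
  return tokens.slice(firstIndex, lastIndex + 1).join(" ");
}

// Assumed tokenization of "Rex is a German Shepherd."
const tokens = ["Rex", "is", "a", "German", "Shepherd", "."];
console.log(defaultRangeValue(tokens, 3, 4)); // prints "German Shepherd"
```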
tagRangeWithValue
Use tagRangeWithValue to add a tag to a range of consecutive tokens and assign the tag a custom value.
For example, when this statement:
DIS.tagRangeWithValue(3, 4, "DOG", "Police dogs");
is applied to this text:
Rex is a German Shepherd.
a DOG tag spanning German and Shepherd (which have zero-based indexes 3 and 4, respectively) will be added to the analysis results. The value of the tag will be Police dogs.
Rename tag
Use renameTag to replace a tag with another tag on the same level.
Consider for example this definition:
TAGS
{
    @FIRST,
    @SECOND
}
If this tagging rule:
SCOPE SENTENCE
{
    TAGGER()
    {
        @FIRST[LEMMA("good")]
    }
}
is applied to this text:
Mark is a good boy.
tag FIRST, covering token good, is added to the analysis results. Then, this JavaScript statement:
DIS.renameTag("FIRST", "SECOND", 3);
untags the FIRST tag and adds tag SECOND to the fourth token, that is, the one with zero-based index 3. Since tag FIRST had level 10000, which is the default for rule-generated tags, tag SECOND inherits that same level.
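The effect of renaming can be sketched with a toy model. The code below is not the DIS implementation: renameTagModel and the record shape (label, first, last, level, negative) are hypothetical, and keeping the same token span for the new tag is a simplifying assumption. It only illustrates that the old tag becomes negative and the new tag takes over its level.

```javascript
// Toy model only: hypothetical function and record shape, not the DIS API.
function renameTagModel(tags, oldLabel, newLabel, tokenIndex) {
  const result = [];
  for (const t of tags) {
    const covers = tokenIndex >= t.first && tokenIndex <= t.last;
    if (t.label === oldLabel && !t.negative && covers) {
      // The old tag is untagged, so it is kept as a negative tag.
      result.push({ ...t, negative: true });
      // The new tag inherits the level (span kept the same by assumption).
      result.push({ ...t, label: newLabel });
    } else {
      result.push({ ...t });
    }
  }
  return result;
}

const before = [
  { label: "FIRST", first: 3, last: 3, level: 10000, negative: false },
];
const after = renameTagModel(before, "FIRST", "SECOND", 3);
```

In this model, after contains a negative FIRST record and a positive SECOND record, both with level 10000.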
getTokenTagEntry
getTokenTagEntry returns an array of objects containing all the tags associated with a token and their corresponding tag entries. The method must be used in the onFinalize function.
For example, given this code:
function onTagger() {
    DIS.tagTokenWithValue(1, "MY_TAG1", "Christmas");
    DIS.tagTokenWithValue(1, "MY_TAG2", "December 25th");
}
applied to this text:
On Xmas, I will buy a new computer.
two tags, each with its own tag entry, are assigned to the token Xmas, that is, the token with zero-based index 1.
With this code:
DIS.getTokenTagEntry(1);
the method returns this:
[
    {
        "tag": "MY_TAG1",
        "entry": "Christmas"
    },
    {
        "tag": "MY_TAG2",
        "entry": "December 25th"
    }
]
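Since the result is an array, looking up the entry for a given tag takes an extra step. The sketch below shows one way to do it, assuming the return shape shown above; entriesByTag is a hypothetical helper, and the sample array is copied from the example rather than obtained from a real call.

```javascript
// Sample data copied from the example return value above.
const tagEntries = [
  { tag: "MY_TAG1", entry: "Christmas" },
  { tag: "MY_TAG2", entry: "December 25th" },
];

// Hypothetical helper: turn the array into a tag -> entry lookup object.
// If the same tag label appears more than once, the last entry wins.
function entriesByTag(entries) {
  const lookup = {};
  for (const { tag, entry } of entries) {
    lookup[tag] = entry;
  }
  return lookup;
}

const lookup = entriesByTag(tagEntries);
console.log(lookup.MY_TAG2); // prints "December 25th"
```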
Syntax
Except for tagRange and renameTag, all tagging and untagging methods without custom value have this syntax:
DIS.method(subdivisionIndex, tagLabel)
where:
method is the tagging or untagging method.
subdivisionIndex is the index of the text subdivision to tag.
tagLabel is the tag label.
All tagging methods with custom value, except for tagRangeWithValue, have the syntax above plus one more parameter:
DIS.method(subdivisionIndex, tagLabel, tagValue)
where tagValue is the tag value.
The syntax of tagRange is:
DIS.tagRange(token1Index, token2Index, tagLabel)
where:
token1Index is the token index of the first token in the range to tag.
token2Index is the token index of the last token in the range to tag.
tagLabel is the tag label.
The syntax of tagRangeWithValue is the same plus one more parameter:
DIS.tagRangeWithValue(token1Index, token2Index, tagLabel, tagValue)
where tagValue is the tag value.
The syntax of renameTag is:
DIS.renameTag(oldTagLabel, newTagLabel, subdivisionIndex)
where:
oldTagLabel is the label of the existing tag.
newTagLabel is the label of the new tag.
subdivisionIndex is the index of the text subdivision to tag.
The syntax of getTokenTagEntry is:
DIS.getTokenTagEntry(tokenIndex)
where tokenIndex is the index of the token to get tags and tag entries from.
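The argument orders above are easy to mix up, especially for renameTag, where the index comes last. As a rough aid, the sketch below uses a hypothetical recording stand-in, makeRecordingDis, to try out calls outside the real environment. It does not tag anything and is not how the pre-defined DIS object is created; it only records the method name and arguments of each call.

```javascript
// Hypothetical stand-in that only records calls. The real DIS object is
// pre-defined by the environment and is not created like this.
function makeRecordingDis() {
  const calls = [];
  const record = (name) => (...args) => {
    calls.push([name, ...args]);
  };
  return {
    calls,
    tagToken: record("tagToken"),
    tagTokenWithValue: record("tagTokenWithValue"),
    tagRange: record("tagRange"),
    tagRangeWithValue: record("tagRangeWithValue"),
    renameTag: record("renameTag"),
  };
}

const dis = makeRecordingDis();
dis.tagToken(3, "COLOR");                           // index, label
dis.tagTokenWithValue(3, "COLOR", "sky color");     // index, label, value
dis.tagRange(3, 4, "DOG");                          // first, last, label
dis.tagRangeWithValue(3, 4, "DOG", "Police dogs");  // first, last, label, value
dis.renameTag("FIRST", "SECOND", 3);                // old label, new label, index
```

Each entry in dis.calls holds the method name followed by its arguments in the order they were passed.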