LIST attribute
Introduction
The LIST attribute identifies a token by specifying the numeric ID of a syncon and considering the syncon itself as just a container of lemmas. The token is recognized in a document if it matches one of the lemmas contained in the specified syncon, regardless of the concept that the syncon represents.
The syntax is:
LIST(ID1[, ID2, ...])
or:
LIST(string1[, string2, ...])
where:
- LIST is the attribute name and must be written in uppercase.
- ID# refers to the unique ID assigned to a knowledge graph syncon. Unknown numbers are not accepted.
- string# refers to any sequence of alphabetical characters, numbers and punctuation marks. Any of the strings to be recognized in a document can be made up of one or several words but must be written between quotation marks.
Note
You can use strings along with IDs within the LIST attribute.
ID syntax
Similar to the SYNCON attribute, the LIST attribute allows the user to specify the ID of a concept (syncon) contained in the knowledge graph. The difference is that SYNCON considers both the form of a word and its contextual meaning, while LIST considers only the form. When the SYNCON attribute is used in a rule, two conditions must be satisfied for a token to be identified in a document:
- The token must match one of the lemmas that is part of the syncon.
- The token must be associated with the meaning represented by the syncon during the disambiguation process.
The second condition, however, is not required by the LIST attribute. The syncon ID is considered merely as a collection of lemmas. All synonyms, variants, abbreviations, etc. that are part of the syncon are matched if found in a document. Synonyms and variants are recognized both in their base form and in their inflected forms. This occurs whether or not the token is disambiguated as an instance of the syncon specified in the rule.
LIST is particularly useful when the token to be recognized in a document is ambiguous, i.e., a word is contained in several syncons representing slightly different meanings and could be disambiguated in several ways. For example, the word glass in the knowledge graph yields different types of glasses, such as a container for holding liquids (syncon 16634), the quantity a glass will hold (syncon 59462), an article made of glass (syncon 16639), etc. In such cases, it is better to identify the word and its synonyms without taking into account how the lemmas are disambiguated. In fact, a token matches a LIST rule even though the syncon associated with it is not the same one specified in the rule.
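The difference between the two checks can be sketched with a toy model in Python. This is only an illustration of the behavior described above, not the engine's implementation: the Token class, the two functions and every lemma other than glass are hypothetical, and inflection is simplified by assuming each token already carries its base form.

```python
from dataclasses import dataclass

# Toy knowledge graph: syncon ID -> lemmas. The IDs come from the text above;
# every lemma other than "glass" is a hypothetical placeholder.
SYNCON_LEMMAS = {
    16634: {"glass", "drinking glass"},
    59462: {"glass", "glassful"},
    16639: {"glass", "glassware"},
}


@dataclass(frozen=True)
class Token:
    text: str    # surface form found in the document
    lemma: str   # base form assigned by a (hypothetical) analyzer
    syncon: int  # syncon chosen by a (hypothetical) disambiguator


def matches_syncon(token: Token, syncon_id: int) -> bool:
    """SYNCON-style check: lemma membership AND disambiguated meaning."""
    return token.lemma in SYNCON_LEMMAS[syncon_id] and token.syncon == syncon_id


def matches_list(token: Token, syncon_id: int) -> bool:
    """LIST-style check: lemma membership only; disambiguation is ignored."""
    return token.lemma in SYNCON_LEMMAS[syncon_id]


# "glasses" disambiguated as the quantity (59462), tested against the container (16634).
token = Token(text="glasses", lemma="glass", syncon=59462)
print(matches_syncon(token, 16634))  # False
print(matches_list(token, 16634))    # True
```

In this sketch the token fails the SYNCON-style check because it was disambiguated differently, but passes the LIST-style check because its lemma belongs to the specified syncon.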
The LIST attribute is also similar to the ANCESTOR attribute in that it allows the user to specify the numeric ID of a syncon contained in the knowledge graph and treat it as the starting point of a chain of concepts, so that the lemmas of all syncons in the chain are matched. To enable this function of the LIST attribute, add a colon (:) after the syncon ID, followed by the number of levels to be navigated downward. The syntax is:
LIST(ID1:levelNumber[, ID2:levelNumber, ...])
where levelNumber ranges from 0 to 99: 0 is the root, which only considers the first level, and 99 is the default value, which considers all levels.
Note
Unlike the ANCESTOR attribute, which considers the whole chain, LIST will only consider the lemmas that are part of the selected syncon ID if no level is specified.
It is also possible to specify a link to be navigated when looking for descendants. This can be done by adding another colon (:) after the level number, followed by the name of the link. The syntax is:
LIST(ID1:levelNumber:linkName[, ID2:levelNumber:linkName, ...])
Valid links are those available in the knowledge graph, including any custom link added for a specific project. If the given ancestor is a noun and no link name is specified, the supernomen/subnomen ("part of" type of relation) link will be navigated by default. If the given ancestor is a verb and no link name is specified, the superverbum/subverbum ("way of" type of relation) link will be navigated by default. Any other links must be specified in order to be considered in the rule.
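To make the ID:levelNumber:linkName form concrete, here is a minimal Python sketch that parses a single item and resolves the default link. It is an illustration only: ListItem, parse_item, effective_link, the part-of-speech tags and the link name my_custom_link are hypothetical names, and the handling of parts of speech other than nouns and verbs is not covered because the text does not describe it.

```python
import re
from typing import NamedTuple, Optional

# Default links named in the text, keyed by a hypothetical part-of-speech tag.
DEFAULT_LINKS = {"noun": "supernomen/subnomen", "verb": "superverbum/subverbum"}

# ID, optionally followed by :levelNumber, optionally followed by :linkName.
ITEM_RE = re.compile(r"^(\d+)(?::(\d+)(?::(\S+))?)?$")


class ListItem(NamedTuple):
    syncon_id: int
    level: Optional[int]  # None: no level given, only the syncon's own lemmas
    link: Optional[str]   # None: no link given, a default link applies


def parse_item(item: str) -> ListItem:
    """Parse one 'ID[:levelNumber[:linkName]]' item of a LIST attribute."""
    match = ITEM_RE.match(item.strip())
    if match is None:
        raise ValueError(f"not in ID[:levelNumber[:linkName]] form: {item!r}")
    syncon_id, level, link = match.groups()
    if level is not None and not 0 <= int(level) <= 99:
        raise ValueError(f"levelNumber must be between 0 and 99: {item!r}")
    return ListItem(int(syncon_id), None if level is None else int(level), link)


def effective_link(item: ListItem, part_of_speech: str) -> str:
    """Return the explicit link, or the default for a noun or verb ancestor."""
    return item.link or DEFAULT_LINKS[part_of_speech]


print(parse_item("171508"))
print(parse_item("73459:1"))
print(effective_link(parse_item("73459:1"), "verb"))
print(effective_link(parse_item("73459:2:my_custom_link"), "verb"))
```

The sketch rejects a link name given without a level, since the syntax above only shows the link after the level number.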
Consider the following examples:
LIST(171508)
Syncon 171508 refers to the word write as a verb with the meaning of creating books, poems, newspaper articles and other original pieces of text. The verb to write can be interpreted in several ways (see syncons 73459, 73491, 171506, etc.), although, in the end, all of them are slightly different variations of the same action. In this case, it is useful to recognize some synonyms without taking into account how these lemmas are disambiguated. Thus, the rule above will recognize the lemmas compose and pen, which are synonyms within syncon 171508, along with the lemma write. Tokens such as write, writing and written will also be matched by the rule even though the syncon associated with these inflected forms may not be syncon 171508 but perhaps 73459, 73491, 171506, etc.
Consider the following example:
LIST(73459:1)
In this example, the lemma write and the lemmas scribble and scrawl, found in syncon 73691, a first-level descendant of syncon 73459 on the superverbum/subverbum link, will be matched if found in a document. The same goes for the lemmas handwrite and hand-write (syncon 70235) as well as the lemma spell (syncon 69957). However, the lemmas misspell and hyphenate (syncons 69958 and 73499, respectively) will not be matched because they belong to the second level of descendants.
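The level limit in this example can be pictured as a depth-limited walk over the graph. The Python sketch below is a toy model, not the engine's algorithm: the function list_lemmas is hypothetical, the lemma lists contain only the lemmas named in the text, and the parent assigned to syncons 69958 and 73499 is an assumption (the text only says they sit on the second level).

```python
from collections import deque

# Lemmas named in the text for each syncon (real syncons may hold more).
LEMMAS = {
    73459: ["write"],
    73691: ["scribble", "scrawl"],
    70235: ["handwrite", "hand-write"],
    69957: ["spell"],
    69958: ["misspell"],
    73499: ["hyphenate"],
}

# Children on the superverbum/subverbum link. The first level is taken from
# the text; placing 69958 and 73499 under 69957 is an assumption.
CHILDREN = {
    73459: [73691, 70235, 69957],
    69957: [69958, 73499],
}


def list_lemmas(root: int, levels: int) -> list:
    """Collect the lemmas of root and of its descendants up to `levels` levels down."""
    collected = []
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        syncon, depth = queue.popleft()
        collected.extend(LEMMAS[syncon])
        if depth < levels:  # stop descending once the level limit is reached
            for child in CHILDREN.get(syncon, []):
                if child not in seen:
                    seen.add(child)
                    queue.append((child, depth + 1))
    return collected


print(list_lemmas(73459, 1))
# ['write', 'scribble', 'scrawl', 'handwrite', 'hand-write', 'spell']
```

With a level of 2, the same walk would also reach misspell and hyphenate in this toy graph.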
String syntax
You can use strings instead of IDs with the LIST attribute. In this case, the LIST attribute works exactly like the LEMMA attribute.
Some things to note:
- The lemmas must be available in the knowledge graph.
- The levelNumber and linkName parameters described above cannot be used with strings.