ANCESTOR attribute
Overview
The ANCESTOR attribute, like SYNCON, matches text tokens based on their meaning. While SYNCON compares the syncon IDs listed as its arguments directly with the syncon IDs of tokens, ANCESTOR first expands the syncon IDs specified as arguments into lists of syncon IDs by navigating the knowledge graph along links.
The link that is navigated by default is supernomen/subnomen for nouns and superverbum/subverbum for verbs.
Navigation starts from each syncon ID specified as an argument and, by default, proceeds downwards recursively: for each syncon reached, all its downward links of the same type are followed in turn, and the same is done for every further syncon reached. By default, navigation ends when all possible links have been followed.
For example, in:
ANCESTOR(100003947)
100003947 is the ID of the syncon for house, which grammatically is a noun.
The produced list includes this ID plus the IDs of all syncons linked to it downwards, recursively, along the supernomen/subnomen link.
The syncons directly linked downwards to syncon 100003947 via supernomen/subnomen links are:
100850107 (bungalow, cottage)
105784097 (hut, cabin)
18020 (maisonette, maisonnette)
22091 (villa)
100751876 (chateau, château)
...
These syncons can in turn be linked downwards with other syncons. For example, 105784097 (hut, cabin) is linked to:
208211 (bush house)
209536 (chalet)
105619559 (gunya, gunyah)
227067 (rancho)
100695207 (log cabin)
...
Proceeding in this way, the IDs of all the "descendant" syncons of 100003947 in the supernomen/subnomen hierarchy are obtained. ANCESTOR(ID) can therefore be interpreted as: the specified ancestor and all of its descendants. This is the basic use of the attribute and the reason for its name.
Any token with a syncon ID included in the list obtained above is matched by the attribute and the operand is true. For example, given this text:
The chalet near the ski slopes was cozy.
the operand is true for the word chalet because the disambiguator attributes to the word the meaning corresponding to the syncon with ID 209536 which, in the supernomen/subnomen hierarchy, is a descendant of the syncon with ID 100003947 (house).
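The expansion and the match described above can be sketched in Python. This is only an illustration, not the product's implementation: the SUBNOMEN dictionary is a toy subset of the hierarchy built from the IDs quoted above, and the names expand and matches are hypothetical.

```python
from collections import deque

# Toy subset of the supernomen/subnomen hierarchy (assumption: only the IDs
# quoted in the text; the real knowledge graph contains many more syncons).
# Keys are parent syncons, values are their direct children.
SUBNOMEN = {
    100003947: [100850107, 105784097, 18020, 22091, 100751876],  # house
    105784097: [208211, 209536, 105619559, 227067, 100695207],   # hut, cabin
}

def expand(start_id, children=SUBNOMEN):
    """Return start_id plus all its descendants, with no depth limit."""
    seen = {start_id}
    queue = deque([start_id])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

def matches(token_syncon_id, start_id):
    """True if the token's syncon ID belongs to the expanded list."""
    return token_syncon_id in expand(start_id)

# 209536 (chalet) is a descendant of 100003947 (house) in the toy data.
print(matches(209536, 100003947))
```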
The basic syntax is:
ANCESTOR(ID1[, ID2, ...])
where ID# is the identifier of the syncon to start the navigation from. By specifying multiple arguments, therefore, it is possible to match multiple sets of syncon IDs.
Recursion limit and link type
For each argument it is possible to limit the number of recursions for link navigation and specify a link type different from the default one.
The syntax is:
ID#:level[:link]
where level is the number of recursions and link is the name of the link.
level has two conventional values:
- 0: no recursion, only the ID given as an argument is considered.
- 99: maximum recursion, all links are followed until they are exhausted. Use this value only if it is necessary to also specify a link type different from the default one; otherwise it can be omitted, because recursion is maximum by default.
Other level values correspond to the actual number of recursions, so 1 is for the syncons directly linked to the specified syncon, 2 to also include the syncons linked to the previous ones and so on.
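A minimal Python sketch of how the level value bounds the navigation, under the assumption that the starting syncon is always part of the result. The SUBNOMEN dictionary is a toy subset built from the IDs quoted earlier for house, and expand_limited is a hypothetical name.

```python
# Toy subset of the supernomen/subnomen hierarchy (assumption: only the IDs
# quoted earlier in the text for house and for hut, cabin).
SUBNOMEN = {
    100003947: [100850107, 105784097, 18020, 22091, 100751876],  # house
    105784097: [208211, 209536, 105619559, 227067, 100695207],   # hut, cabin
}

def expand_limited(start_id, level=99, children=SUBNOMEN):
    """Expand start_id downwards for at most `level` recursions.

    level 0 keeps only the starting ID; level 99 is treated as "no limit",
    mirroring the conventional value described above.
    """
    result = {start_id}
    frontier = {start_id}
    depth = 0
    while frontier and (level == 99 or depth < level):
        next_frontier = set()
        for node in frontier:
            for child in children.get(node, []):
                if child not in result:
                    result.add(child)
                    next_frontier.add(child)
        frontier = next_frontier
        depth += 1
    return result

for level in (0, 1, 2, 99):
    print(level, len(expand_limited(100003947, level)))
```

With this toy data, levels 2 and 99 give the same list because the hierarchy is only two links deep.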
Valid link types are those available in the knowledge graph, including any custom link added for a specific project. For example, in:
ANCESTOR(100000145:2:syncon/geography)
100000145 corresponds to United Kingdom, so the attribute is true for tokens like United Kingdom, Scotland ("child" of United Kingdom in the syncon/geography hierarchy) plus any other country of the United Kingdom, and Edinburgh ("child" of Scotland) plus any other city of any country of the United Kingdom.
If the link name contains spaces or dashes it should be written in quotation marks.
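The following Python sketch shows one way to read an argument of the form ID:level:link and navigate the chosen link. It is an illustration under several assumptions: GRAPH is a hypothetical toy graph in which labels stand in for the syncon IDs the text does not give, the entry below Edinburgh is invented to show where level 2 stops, and parse_argument and expand are hypothetical names, not part of the rules language.

```python
# Hypothetical toy graph: link name -> parent -> direct children.
GRAPH = {
    "syncon/geography": {
        "100000145": ["Scotland"],               # United Kingdom
        "Scotland": ["Edinburgh"],
        "Edinburgh": ["hypothetical district"],  # invented third level
    },
}

def parse_argument(arg):
    """Split 'ID[:level[:link]]' into (id, level, link).

    A missing level defaults to 99 (maximum recursion); a missing link is
    returned as None, meaning "use the default link". Quotation marks
    around the link name are removed.
    """
    parts = arg.split(":", 2)
    syncon_id = parts[0]
    level = int(parts[1]) if len(parts) > 1 else 99
    link = parts[2].strip('"') if len(parts) > 2 else None
    return syncon_id, level, link

def expand(arg, default_link="supernomen/subnomen"):
    """Expand one argument downwards along its link, up to its level."""
    syncon_id, level, link = parse_argument(arg)
    children = GRAPH.get(link or default_link, {})
    result, frontier, depth = {syncon_id}, {syncon_id}, 0
    while frontier and (level == 99 or depth < level):
        frontier = {c for n in frontier for c in children.get(n, [])} - result
        result |= frontier
        depth += 1
    return result

print(sorted(expand("100000145:2:syncon/geography")))
```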
Matching the parent syncon
When a text token is recognized as a noun that does not correspond to any lemma in the knowledge graph, the disambiguator cannot attribute the ID of a syncon to the token.
Nevertheless, sometimes the disambiguator manages to understand that the concept expressed by the token is "a type of" an existing concept in the knowledge graph, i.e. it identifies a "parent" concept.
For example, in this text:
It is a type of pattern seen in the tiled Islamic mosaics at the Alhambra Palace in Spain and the Darb-i Imam shrine in Iran, but which had never been thought could exist in nature.
Darb-i Imam does not correspond to any lemma contained in the factory knowledge graph, but the disambiguator recognizes it as a type of shrine.
In cases like this, the token does not have a syncon ID, but receives from the disambiguator an ID of the parent syncon of which the token appears to represent a specialization.
In the case of the example, the token has the parent attribute set to 100697295, ID of the syncon for shrine.
ANCESTOR also matches the parent attribute of tokens. For example, in:
ANCESTOR(100041176)
100041176 is the ID of the syncon for building, construction, edifice, etc. and is expanded to all its descendants in the supernomen/subnomen hierarchy. This lineage includes the concept of shrine, so the operand is true for the Darb-i Imam token because it matches the value of its parent attribute.
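The check against the parent attribute can be pictured with a small Python sketch. The Token class and its field names are hypothetical, and EXPANDED stands for the already expanded list of ANCESTOR(100041176), reduced here to the three IDs quoted in this page.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Token:
    # Hypothetical token model: a token has a syncon ID, a parent ID,
    # or neither of the two.
    text: str
    syncon_id: Optional[int] = None
    parent: Optional[int] = None

# Assumed result of expanding ANCESTOR(100041176), limited to IDs quoted in
# the text: building, house and shrine.
EXPANDED = {100041176, 100003947, 100697295}

def ancestor_matches(token, expanded_ids):
    """True if the token's syncon ID or its parent attribute is in the list."""
    return token.syncon_id in expanded_ids or token.parent in expanded_ids

darb_i_imam = Token("Darb-i Imam", parent=100697295)
print(ancestor_matches(darb_i_imam, EXPANDED))
```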
Matching unknown nouns
With the syntax:
ANCESTOR(UNKNOWN)
the attribute matches tokens grammatically recognized as nouns to which the disambiguator has not assigned either the ID of a syncon or the parent attribute (see above). For example, with this text:
Nobody knows what a gobbodilot is.
the above operand is true for the gobbodilot token.
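As a rough Python sketch, the condition tested by ANCESTOR(UNKNOWN) can be written as follows. The Token class, its pos field and the "noun" value are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Token:
    # Hypothetical token model with a part-of-speech label.
    text: str
    pos: str
    syncon_id: Optional[int] = None
    parent: Optional[int] = None

def matches_unknown(token):
    """True for nouns that have neither a syncon ID nor a parent attribute."""
    return (
        token.pos == "noun"
        and token.syncon_id is None
        and token.parent is None
    )

print(matches_unknown(Token("gobbodilot", "noun")))
```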
Double link
By specifying a second link type, ANCESTOR uses the first link type only to navigate the knowledge graph and find base syncons. The list of syncon IDs to consider is then produced by taking, for each base syncon, the syncons directly linked to it (no recursion) via the second link type. In detail:
- Find all syncons recursively linked to the one whose ID is given as an argument by following the first link type for the specified number of recursions. Neither the ID specified as the argument nor the IDs of the syncons visited during this navigation are added to the list.
- For the argument syncon and for each syncon visited during the navigation in the previous point, if there are syncons linked to it via the second link type, add the IDs of those syncons to the list.
For example, for:
ANCESTOR(100041176:2:supernomen/subnomen:omninomen/parsnomen)
the syncon ID list is constructed like this:
1. The navigation starts from the syncon with ID 100041176 (building, construction, edifice, ...), which is not included in the list.
2. The IDs of any syncons linked downwards (without recursion) to the above syncon with the omninomen/parsnomen link are added to the list.
3. The navigation goes on to any syncons linked downwards to the syncon with ID 100041176 by following the supernomen/subnomen link. These syncons are not added to the list, only visited.
4. The IDs of any syncons linked downwards (without recursion) with the omninomen/parsnomen link to all the syncons visited in step 3 are added to the list.
5. From each of the syncons visited in step 3, the navigation goes on to any syncons linked downwards following the supernomen/subnomen link. These syncons are not added to the list, only visited.
6. The IDs of any syncons linked downwards (without recursion) with the omninomen/parsnomen link to all the syncons visited in step 5 are added to the list.
The above attribute matches tokens like foyer and dunny because the first is linked downwards, with the omninomen/parsnomen link, to the syncon for building and the second is linked downwards, again with the omninomen/parsnomen link, to the syncon for house, a child of the syncon for building in the supernomen/subnomen hierarchy.
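The double-link procedure can be sketched in Python as follows. This is a simplified illustration: both dictionaries are toy data, the labels "foyer" and "dunny" stand in for syncon IDs that the text does not give, and expand_double is a hypothetical name.

```python
# Toy data (assumption): first link type, used only for navigation.
SUBNOMEN = {
    100041176: [100003947],   # building -> house
    100003947: [105784097],   # house -> hut, cabin
}
# Toy data (assumption): second link type, used to collect the result.
PARSNOMEN = {
    100041176: ["foyer"],     # part of building
    100003947: ["dunny"],     # part of house
}

def expand_double(start_id, level, nav_children, target_children):
    """Visit syncons along the first link for up to `level` recursions and
    collect only what is directly linked to them via the second link."""
    visited = {start_id}
    frontier = {start_id}
    result = set()
    depth = 0
    while frontier:
        # Collect the direct second-link children of every visited syncon.
        for node in frontier:
            result.update(target_children.get(node, []))
        if depth == level:
            break
        # Move one step down the first link; visited syncons are not collected.
        frontier = {
            child
            for node in frontier
            for child in nav_children.get(node, [])
            if child not in visited
        }
        visited |= frontier
        depth += 1
    return result

print(sorted(expand_double(100041176, 2, SUBNOMEN, PARSNOMEN)))
```

In this sketch the syncons for building, house and hut are visited but never added to the result, which contains only the two labels reached through the second link.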
Upwards navigation
To navigate knowledge graph links upwards instead of downwards, the name of the link type must be specified and prefixed with a dash (-).
For example, this operand:
ANCESTOR(100003947:1:-supernomen/subnomen)
matches the token building in this text:
I live in a nice building.
because 100003947 corresponds to house, ANCESTOR expands the list of syncon IDs by navigating the supernomen/subnomen hierarchy upwards, and the syncon for building is the parent of the syncon for house.
To correctly choose the navigation direction, knowledge of how each type of link is organized is necessary, i.e. which concepts are above and which are below.
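One way to picture the effect of the dash prefix is to navigate the same links in the opposite direction, as in this Python sketch. The SUBNOMEN dictionary is a toy subset built from IDs quoted in this page, the function names are hypothetical, and the link argument is inspected only for its leading dash.

```python
# Toy subset of the supernomen/subnomen hierarchy (assumption).
SUBNOMEN = {
    100041176: [100003947],   # building -> house
    100003947: [105784097],   # house -> hut, cabin
}

def invert(children):
    """Build a child -> parents map from a parent -> children map."""
    parents = {}
    for parent, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, []).append(parent)
    return parents

def expand(start_id, level, link):
    """Expand start_id for `level` recursions; a leading dash in the link
    name selects upwards navigation."""
    edges = invert(SUBNOMEN) if link.startswith("-") else SUBNOMEN
    result, frontier = {start_id}, {start_id}
    for _ in range(level):
        frontier = {n for node in frontier for n in edges.get(node, [])} - result
        result |= frontier
    return result

# Upwards from house: reaches building. Downwards from house: reaches hut, cabin.
print(expand(100003947, 1, "-supernomen/subnomen"))
print(expand(100003947, 1, "supernomen/subnomen"))
```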
Warning
If the argument syncon is at or near the root of a hierarchical link type and the number of recursions is high, downwards navigation can produce a very large number of descendants. This can make the analysis slow. ANCESTOR(UNKNOWN) can have the same effect. In these cases it is advisable to combine ANCESTOR with other attributes or operands.