The disambiguator recognizes entities such as people's names, dates, addresses, monetary values and measures when they occur in a text. For instance, in the sentence:
John Smith lives in a 900 sq ft apartment at 22 Green Park Street.
the disambiguator recognizes John Smith as a person's name (entity type NPH), 900 sq ft as a measure (entity type MEA) and 22 Green Park Street as an address (entity type ADR).
Recognized entities in a text can be matched by a rule condition using the TYPE attribute and the entity type.
Some of them are also referred to as structured entities, because they are usually made of components like numbers, letters and punctuation marks.
As mentioned above, a structured entity is an aggregation of components. For example, a date contains at least a day and a month, or a month and a year, and an address contains at least a street number and a street name.
Structured entity types such as ADR have two properties:
- They are always associated with a virtual supernomen which specifies what type of entity they are.
- They can be subdivided into logical components.
The first property exists because tokens like 22 Green Park Street don't correspond to standard Knowledge Graph syncons. The disambiguator nevertheless understands that the text is an address and assigns it the meaning of street. In this way, otherwise unknown entities automatically receive the syncon ID of a known concept.
For example:
February 28, 1893
is recognized as a date and as an instance of the DAT type during disambiguation, and it's also assigned the virtual supernomen corresponding to syncon 65454 (date, tag_date). Because of this recognition, the disambiguator can also distinguish between the day, the month and the year, and these components can be used in rules.
In another example:
900 sq ft
the text is recognized as an instance of the MEA type and is assigned the virtual supernomen corresponding to syncon 58572 (square foot). Such an entity has two parts: the numeric value (900) and the unit of measurement (sq ft).
The TRANSFORM feature allows the user to define how structured entities are divided into components.
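To make the idea of components concrete, the following Python sketch splits the two example entities into parts. It is only an illustration of the concept: the regular expressions and the function names `split_measure` and `split_date` are hypothetical, and this is not how the disambiguator or the TRANSFORM feature is implemented. The dictionary keys reuse the component labels listed later in this section.

```python
import re

# Hypothetical pattern: a number followed by a free-text unit, e.g. "900 sq ft".
_MEASURE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+(.+?)\s*$")

# Hypothetical pattern: "<Month name> <day>, <year>", e.g. "February 28, 1893".
_DATE = re.compile(r"^\s*([A-Za-z]+)\s+(\d{1,2}),\s*(\d{4})\s*$")


def split_measure(text):
    """Split a measure into its numeric value and its unit of measurement."""
    match = _MEASURE.match(text)
    if match is None:
        raise ValueError(f"not a measure: {text!r}")
    return {
        "tag_number": float(match.group(1)),
        "cat. unit of measurement": match.group(2),
    }


def split_date(text):
    """Split a date into month, day and year components."""
    match = _DATE.match(text)
    if match is None:
        raise ValueError(f"not a date: {text!r}")
    return {
        "tag_month": match.group(1),
        "tag_day": int(match.group(2)),
        "tag_year": int(match.group(3)),
    }


measure = split_measure("900 sq ft")
date = split_date("February 28, 1893")
print(measure)
print(date)
```

A real project would rely on the components produced during disambiguation rather than on patterns like these, which only handle the two formats shown.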
Below is a list of the virtual supernomens and their corresponding entity types along with the concepts of their components. Check the Knowledge Graph to find the syncon ID for each of them depending on your project language.
| Entity type | Virtual supernomen | Components' syncons |
|---|---|---|
| NPH | cat. person | tag_first_name, tag_surname, tag_gender |
| DAT | tag_date | tag_weekday, tag_day, tag_month, tag_year |
| MEA | cat. unit of measurement | tag_number, cat. unit of measurement |
| | cat. money | tag_number, tag_currency |
| ADR | street | tag_road, tag_proper_noun, tag_street_number |
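When writing tooling around a project, it can help to keep this list as a small lookup structure. The sketch below encodes the list above as a Python dictionary keyed by virtual supernomen label. The names `COMPONENTS_BY_SUPERNOMEN` and `components_of` are hypothetical, and the mapping deliberately stores labels rather than syncon IDs, since the IDs depend on the project language and have to be looked up in the Knowledge Graph.

```python
from typing import Dict, Tuple

# Virtual supernomen label -> labels of the component concepts,
# copied from the list above. Syncon IDs are intentionally omitted
# because they vary with the project language.
COMPONENTS_BY_SUPERNOMEN: Dict[str, Tuple[str, ...]] = {
    "cat. person": ("tag_first_name", "tag_surname", "tag_gender"),
    "tag_date": ("tag_weekday", "tag_day", "tag_month", "tag_year"),
    "cat. unit of measurement": ("tag_number", "cat. unit of measurement"),
    "cat. money": ("tag_number", "tag_currency"),
    "street": ("tag_road", "tag_proper_noun", "tag_street_number"),
}


def components_of(supernomen: str) -> Tuple[str, ...]:
    """Return the component labels for a virtual supernomen label."""
    try:
        return COMPONENTS_BY_SUPERNOMEN[supernomen]
    except KeyError:
        raise KeyError(f"no structured entity uses supernomen {supernomen!r}") from None


print(components_of("street"))
```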