TYPE attribute overview
Basic syntax
The TYPE attribute matches tokens by their type, that is, their word class or their entity type.
The syntax is:
TYPE(type1[, type2, ...])
where:
TYPE is the attribute name and must be written in uppercase.
type# refers to a predefined value chosen from either the language's word classes (for example nouns, verbs, adjectives) or the entity types the disambiguator can recognize (for example person names, dates, addresses).
Available word classes and entity types are listed below.
Word classes
Word class | Description |
---|---|
ADJ | Adjective |
ART | Article |
AUX | Auxiliary verb |
ADV | Adverb |
CON | Conjunction |
NOU | Noun |
NPR | Proper noun |
PNT | Punctuation mark |
PRE | Preposition |
PRO | Pronoun |
PRT | Particle |
VER | Verb |
Note
The PRT word class is available for German and French only.
Entity types
Label | Description | Example |
---|---|---|
ADR | Street address | Who lived at 221B Baker Street? |
ANM | Animal | Felix is an anthropomorphic black cat. |
BLD | Building | While in London I attended a concert at the Royal Albert Hall. |
COM | Company, business | Tesla Inc. sold 10% of its Bitcoin holdings. |
DAT | Date | Napoleon died on May 5, 1821. |
DEV | Device | My new Galaxy smartphone has seven cameras. |
DOC | Document | I appeal to the Geneva Convention! |
ENT | Generic entity | I have five minutes left. |
EVN | Event | Felice Gimondi won the Tour de France in 1965. |
FDD | Food, beverage | Frank likes to drink Guinness beer. |
GEA | Physical geographic feature | I crossed the Mississippi river with my boat. |
GEO | Administrative geographic area | Alaska is the least densely populated state in the United States. |
GEX | Extended geography | The astronauts have landed on Mars. |
HOU | Hours | The eclipse reached its peak at 3pm. |
LEN | Legal entity | Of course I pay the FICA tax. |
MAI | Email address | For any questions do not hesitate to write to [email protected]. |
MEA | Measure | The chest is five feet wide and 40 inches tall. |
MMD | Mass media | I read it in the Guardian. |
MON | Money | I sold half of my stocks and made six hundred thousand dollars. |
NPH | Person | Hakeem Olajuwon dunked effortlessly. |
ORG | Organization, institution, society | Now they threaten to quit the United Nations if they are not heard. |
PCT | Percentage | The richest 10% of adults in the world own 85% of global wealth. |
PHO | Phone number | For poor database design, call (214) 748-3647. |
PPH | Physical phenomena | The COVID-19 infection is slowing down. |
PRD | Product | The Rolex Daytona is a wonderful watch. |
VCL | Vehicle | A Ferrari 250 GTO was the most expensive car ever sold. |
WEB | Web address | Find the best technical documentation at docs.expert.ai. |
WRK | Work of human intelligence | Grease is a funny musical romantic comedy. |
Examples
Consider the following example:
TYPE(NOU, ADJ)
The operand above matches two word classes: nouns (NOU) like cat and adjectives (ADJ) like strong.
The following operand, on the other hand:
TYPE(DAT, ADR)
matches entities of two types, dates (DAT) and addresses (ADR), when recognized in a text. See the topic about entity recognition for more information.
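The behavior of a value list can be pictured with a small Python model. This is only an illustrative sketch, not how the disambiguator is implemented: the `Token` class, its `type` field, and the `type_matches` helper are hypothetical names, and each token is assumed to carry a single type label.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    type: str  # assumption: one word-class or entity-type label per token


def type_matches(token, *types):
    """Return True if the token's type is any one of the listed values."""
    return token.type in types


tokens = [Token("strong", "ADJ"), Token("cat", "NOU"), Token("runs", "VER")]

# Models TYPE(NOU, ADJ): a token matches if it is a noun or an adjective.
matched = [t.text for t in tokens if type_matches(t, "NOU", "ADJ")]
```

In this model, `matched` holds the adjective and the noun, while the verb is left out.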
Warning
If used alone, the TYPE attribute can be overly generative, because it matches every token of the specified types, so it is advisable to use it in conjunction with other attributes.
Sub-attributes
You can specify sub-attributes for word classes. These are grammatical features of the token such as the gender and number of nouns, the tense of verbs, the type of adverbs, etc.
For example, this rule:
SCOPE SENTENCE
{
    DOMAIN(dom1)
    {
        TYPE(NOU:S)
    }
}
applied to this text:
Mary went to the sea with her children.
will match sea, a singular noun, but not children.
You can specify one or more sub-attributes. The syntax is:
type:subAttribute1[:subAttribute2 ...]
Available sub-attributes for each language are described on the following pages.
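To make the shape of the syntax concrete, here is a minimal Python sketch that splits an operand into its type values and their sub-attributes. The function names `parse_type_value` and `parse_type_operand` are hypothetical and exist only for illustration. The sketch does not check that a sub-attribute is valid for a given language.

```python
def parse_type_value(value):
    """Split a value such as 'NOU:M:S' into ('NOU', ['M', 'S'])."""
    head, *subs = value.strip().split(":")
    return head, subs


def parse_type_operand(operand):
    """Split 'TYPE(a, b)' into a list of (type, sub-attributes) pairs."""
    text = operand.strip()
    if not (text.startswith("TYPE(") and text.endswith(")")):
        raise ValueError("expected an operand of the form TYPE(...)")
    body = text[len("TYPE("):-1]
    return [parse_type_value(v) for v in body.split(",")]


single = parse_type_operand("TYPE(NOU:S)")
plain = parse_type_operand("TYPE(DAT, ADR)")
```

A value with no colon simply yields an empty list of sub-attributes.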
When you specify more than one sub-attribute, make sure that each one belongs to a different list of features of the TYPE value.
For example, this rule with a value of NOU:
SCOPE SENTENCE
{
    IDENTIFY(NOUNS)
    {
        @MALE_SINGULAR_NOUNS[TYPE(NOU:M:S)]
    }
}
has two sub-attributes, M (masculine) and S (singular), which belong to different lists of features, so the attribute matches any noun that is both masculine and singular.
If you specify two sub-attributes belonging to the same list of features of the TYPE value, like in this rule:
SCOPE SENTENCE
{
    IDENTIFY(VERBS)
    {
        @REGULAR_VERBS[TYPE(VER:simple_past:ed_form)]
    }
}
only the last sub-attribute will be considered.
A workaround is to specify several values, each with a single sub-attribute, separated by commas (,), for example:
SCOPE SENTENCE
{
    IDENTIFY(VERBS)
    {
        @REGULAR_VERBS[TYPE(VER:simple_past, VER:ed_form)]
    }
}
The comma acts like an OR operator.
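The difference between the two forms can be sketched with a small Python model. This is an illustration under stated assumptions, not the engine's implementation: the `FEATURE_LIST` table and its list names (`gender`, `number`, `verb_form`) are hypothetical, as are the two helper functions. The only things taken from the text above are that M and S belong to different lists, that simple_past and ed_form belong to the same list, that the last sub-attribute of a list wins, and that the comma acts like OR.

```python
# Hypothetical grouping of sub-attributes into feature lists.
FEATURE_LIST = {
    ("NOU", "M"): "gender",
    ("NOU", "S"): "number",
    ("VER", "simple_past"): "verb_form",
    ("VER", "ed_form"): "verb_form",
}


def effective_subattributes(type_value):
    """Keep one sub-attribute per feature list; a later one replaces an earlier one."""
    head, *subs = type_value.split(":")
    chosen = {}
    for sub in subs:
        chosen[FEATURE_LIST[(head, sub)]] = sub
    return head, set(chosen.values())


def matches(token_type, token_features, operand_values):
    """Values separated by commas act as OR; sub-attributes inside one value act as AND."""
    for value in operand_values:
        head, required = effective_subattributes(value)
        if token_type == head and required <= token_features:
            return True
    return False


# A verb token assumed to carry only the simple_past feature.
chained = matches("VER", {"simple_past"}, ["VER:simple_past:ed_form"])
separate = matches("VER", {"simple_past"}, ["VER:simple_past", "VER:ed_form"])
```

In this model the chained form reduces to ed_form alone, so a token that has only the simple_past feature is not matched, while the comma-separated form matches it.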