ENTRY transforms what is matched by the attribute into its most significant base form. This option is used to normalize output values known to the Knowledge Graph.
The syntax for extraction rules is:
The syntax for tagging rules is:
To use this option, the attribute chosen should be among those capable of recognizing elements found within the Knowledge Graph (see LIST excluding the UNKNOWN elements: syncon and ancestor).
The ENTRY option returns a constant form whenever a concept (syncon) is identified in a text, in any of the forms and variations contained in the Knowledge Graph. The constant form returned corresponds to the base form of the syncon's "main lemma". The main lemma is the most representative word for a given concept: in a syncon that contains many lemmas, it is the word most commonly used to refer to the concept. This parameter is associated with a lemma during the Knowledge Graph enrichment phase; the main lemma can therefore be considered a predefined attribute of a lemma.
Consider the following example:
@COMPANY_NAME[ANCESTOR(37475) + TYPE(NPR) - SYNCON(UNKNOWN)]|[ENTRY]
The purpose of this rule is to extract a chain of proper noun concepts (+ TYPE(NPR)) starting from syncon 37475 (company), only if the identified concepts are not "unknown" to the Knowledge Graph (- SYNCON(UNKNOWN)); in other words, extract proper names of companies found within the Knowledge Graph. If this condition is verified, the ENTRY transformation option will ensure that every form that a company's name can take is transformed into the syncon main lemma. This allows a concept to have one consistent extraction value even though the concept may appear in several different forms in a text.
Consider the extraction output if the above rule is run against the following sample sentence:
The equities index is 20 percent above its level on Sept. 15, 2008, the first trading day after Lehman Brothers Holdings Inc. filed the world's biggest bankruptcy and prompted a 46 percent drop through March 9, 2009.
Lehman Brothers is having a great year. The bank, which almost destroyed the global economy four years ago this week, recently emerged from bankruptcy, resolved a third of its debts and executed the largest U.S. real estate deal of the year.
The text contains two values matching the sample rule: Lehman Brothers Holdings Inc. and Lehman Brothers, which are both analyzed as companies. The disambiguator also recognizes these two names as the same company, associated with syncon 317862; in the Knowledge Graph, this syncon contains five different forms referring to the same concept. The extraction panel shows that the extracted value is its main lemma, Lehman Brothers, while the text record shows the two instances found in the text: Lehman Brothers Holdings Inc. and Lehman Brothers. This means that the extracted values have been transformed, and thereby normalized, into the main lemma thanks to the ENTRY option.
Please note that if the ENTRY option is used with a value not contained in the Knowledge Graph, but which received a virtual supernomen, the value returned will be the main lemma of its virtual supernomen (see SMARTENTRY for more details).
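The normalization described in the Lehman Brothers example can be pictured with a small Python sketch. This is only an illustration of the idea, not the product's API: the two dictionaries and the `entry` function are hypothetical stand-ins for the Knowledge Graph and the disambiguator, and only the two surface forms quoted in the sample text are listed.

```python
# Hypothetical stand-in for the Knowledge Graph: syncon id -> main lemma.
MAIN_LEMMA_BY_SYNCON = {317862: "Lehman Brothers"}

# Hypothetical stand-in for disambiguation: surface form -> syncon id.
# Only the two forms quoted in the sample text appear here.
SYNCON_BY_FORM = {
    "Lehman Brothers Holdings Inc.": 317862,
    "Lehman Brothers": 317862,
}


def entry(matched_text):
    """Return the main lemma of the syncon linked to matched_text.

    Returns None for text with no known syncon, which models the
    sample rule's `- SYNCON(UNKNOWN)` exclusion (no extraction).
    """
    syncon = SYNCON_BY_FORM.get(matched_text)
    if syncon is None:
        return None
    return MAIN_LEMMA_BY_SYNCON[syncon]


# The two instances found in the sample text.
matches = ["Lehman Brothers Holdings Inc.", "Lehman Brothers"]

# Both instances collapse into a single normalized extraction value.
extracted = {entry(m) for m in matches}
print(extracted)
```

Both matched instances map to the same syncon, so the set of extracted values contains a single element, the main lemma, which mirrors what the extraction panel shows.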