Named entity recognition

Named entity recognition is a type of document analysis.
It determines which entities—persons, places, organizations, dates, addresses, etc.—are mentioned in a text, together with the attributes of those entities that can be inferred through semantic analysis.

Named entity recognition also performs knowledge linking: Knowledge Graph information and open data—Wikidata, DBpedia and GeoNames references—are returned for entities corresponding to syncons of the expert.ai Knowledge Graph. In the case of actual places, geographic coordinates are also provided.
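
For example, with the Python client used further down this page, the linked open-data references could be read back roughly as in the following sketch. This is only a hedged illustration: the knowledge, syncon, properties, type_ and value names are assumptions about how the client mirrors the JSON response, not confirmed by this page—check the reference section for the actual layout.

from expertai.nlapi.cloud.client import ExpertAiClient

client = ExpertAiClient()

# A place name, so geographic coordinates may be among the linked properties
output = client.specific_resource_analysis(
    body={"document": {"text": "Rome is the capital of Italy."}},
    params={'language': 'en', 'resource': 'entities'})

# Assumption: the knowledge section of the response lists, per Knowledge Graph
# syncon, the linked open-data properties (Wikidata, DBpedia, GeoNames, coordinates)
knowledge_by_syncon = {k.syncon: k for k in output.knowledge}

for entity in output.entities:
    linked = knowledge_by_syncon.get(entity.syncon)
    if linked is None:
        continue
    for prop in linked.properties:
        # prop.type_ and prop.value are assumed attribute names
        print(entity.lemma, prop.type_, prop.value)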

Entities are also recognized in pronouns and shorter forms that refer to named mentions.
This kind of recognition by reference is called anaphoric, because the entities are reached through anaphoras—expressions, such as pronouns, that refer back to a previously mentioned antecedent.

For example, in this text:

Michael Jordan was one of the best basketball players of all time.

Scoring was Jordan's stand-out skill, but he still holds a defensive NBA record, with eight steals in a half.

three mentions of Michael Jordan are recognized:

  • the full named mention: Michael Jordan
  • the anaphoras—Jordan and he—for which Michael Jordan is considered the antecedent.
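
The sketch below shows one way to observe this with the Python client introduced later on this page. It assumes each entity in the response carries a positions list with start and end character offsets—one position per mention, anaphoric ones included; those attribute names are assumptions, not confirmed by this page.

from expertai.nlapi.cloud.client import ExpertAiClient

client = ExpertAiClient()

text = "Michael Jordan was one of the best basketball players of all time. Scoring was Jordan's stand-out skill, but he still holds a defensive NBA record, with eight steals in a half."

output = client.specific_resource_analysis(
    body={"document": {"text": text}},
    params={'language': 'en', 'resource': 'entities'})

for entity in output.entities:
    # Assumption: anaphoric mentions (Jordan, he) appear as additional positions
    # of the Michael Jordan entity rather than as separate entities
    mentions = [text[p.start:p.end] for p in entity.positions]
    print(entity.lemma, '->', mentions)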

Full analysis includes named entity recognition, but if you are not interested in the other analyses, you can use dedicated resources with paths like this:

analyze/{context name}/{language code}/entities

The parts in curly braces are placeholders, so for example:

https://nlapi.expert.ai/v2/analyze/standard/en/entities

is the URL of the standard context resource performing named entity recognition on an English text.
These resources must be requested with the POST method.
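
As a quick illustration of filling in the placeholders and posting a document, here is a minimal Python sketch using the requests library; it mirrors the curl examples further down this page and assumes a valid authorization token is available in an environment variable (EAI_TOKEN is just an example name; the token itself must be obtained separately).

import json
import os

import requests

context = "standard"   # placeholder: context name
language = "en"        # placeholder: language code
url = f"https://nlapi.expert.ai/v2/analyze/{context}/{language}/entities"

# Assumption: the authorization token was obtained beforehand and stored in EAI_TOKEN
response = requests.post(
    url,
    headers={
        "Authorization": f"Bearer {os.environ['EAI_TOKEN']}",
        "Content-Type": "application/json; charset=utf-8",
    },
    json={"document": {"text": "Michael Jordan was one of the best basketball players of all time."}})

# Pretty-printing the raw JSON response is a quick way to get familiar with the output format
print(json.dumps(response.json(), indent=2))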

In the reference section of this manual you will find all the information you need to perform named entity recognition using the API's RESTful interface.

Note

Even if you consume the API through a ready-to-use client that hides low-level requests and responses, knowing the output format helps you understand and navigate the results.

Here are examples of performing named entity recognition on a short English text with the available clients and with direct REST calls:

This example code uses expertai-nlapi, the open-source Python client corresponding to the nlapi-python GitHub project.

The client gets user credentials from two environment variables:

EAI_USERNAME
EAI_PASSWORD

Set those variables with your account credentials before running the sample program below.

The program prints the list of entities with their type.

from expertai.nlapi.cloud.client import ExpertAiClient

client = ExpertAiClient()

text = "Michael Jordan was one of the best basketball players of all time. Scoring was Jordan's stand-out skill, but he still holds a defensive NBA record, with eight steals in a half."
language = 'en'

# Request stand-alone named entity recognition for the document
output = client.specific_resource_analysis(
    body={"document": {"text": text}},
    params={'language': language, 'resource': 'entities'})

# Print each entity's lemma and type in two aligned columns
print(f'{"ENTITY":{50}} {"TYPE":{10}}')
print(f'{"------":{50}} {"----":{10}}')

for entity in output.entities:
    print(f'{entity.lemma:{50}} {entity.type_:{10}}')
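
Note that the Python client exposes the entity type as type_, with a trailing underscore, presumably to avoid clashing with Python's built-in type; in the JSON response—and in the Node.js and Java examples below—the field is simply type.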

This example code uses @expertai/nlapi, the open-source NodeJS client corresponding to the nlapi-nodejs GitHub project.

The client gets user credentials from two environment variables:

EAI_USERNAME
EAI_PASSWORD

Set those variables with your account credentials before running the sample program below.

The program prints a table containing the lemma and the type for each entity.

// NLClient reads the credentials from the EAI_USERNAME and EAI_PASSWORD environment variables
import {NLClient, Language, Analysis} from "@expertai/nlapi";

var nlClient = new NLClient();

var text = "Michael Jordan was one of the best basketball players of all time. Scoring was Jordan's stand-out skill, but he still holds a defensive NBA record, with eight steals in a half.";

// Request stand-alone named entity recognition for the document
nlClient.analyze(text, {
  language: Language.EN,
  context: "standard",
  analysis: Analysis.Entities
}).then((result) => {
    console.log("Named entities with their type:");
    console.table(result.data.entities, ["lemma", "type"]);
});

This example code uses nlapi-java-sdk, the open-source Java client corresponding to the nlapi-java GitHub project.

The client gets user credentials from two environment variables:

EAI_USERNAME
EAI_PASSWORD

Set those variables with your account credentials before running the sample program below.

The program prints the JSON response and the list of entities with their type.

import ai.expert.nlapi.security.Authentication;
import ai.expert.nlapi.security.Authenticator;
import ai.expert.nlapi.security.BasicAuthenticator;
import ai.expert.nlapi.security.DefaultCredentialsProvider;
import ai.expert.nlapi.v2.API;
import ai.expert.nlapi.v2.cloud.Analyzer;
import ai.expert.nlapi.v2.cloud.AnalyzerConfig;
import ai.expert.nlapi.v2.message.AnalyzeResponse;
import ai.expert.nlapi.v2.model.AnalyzeDocument;

public class Main {

    public static Authentication createAuthentication() throws Exception {
        DefaultCredentialsProvider credentialsProvider = new DefaultCredentialsProvider();
        Authenticator authenticator = new BasicAuthenticator(credentialsProvider);
        return new Authentication(authenticator);
    }

    public static Analyzer createAnalyzer() throws Exception {
        return new Analyzer(AnalyzerConfig.builder()
                .withVersion(API.Versions.V2)
                .withContext("standard")
                .withLanguage(API.Languages.en)
                .withAuthentication(createAuthentication())
                .build());
    }

    public static void main(String[] args) {
        try {
            String text = "Michael Jordan was one of the best basketball players of all time. Scoring was Jordan's stand-out skill, but he still holds a defensive NBA record, with eight steals in a half.";

            Analyzer analyzer = createAnalyzer();

            AnalyzeResponse entities = analyzer.entities(text);


            // Output JSON representation

            System.out.println("JSON representation:");
            entities.prettyPrint();


            // Tab-separated list of entities' lemma and type.

            System.out.println("Tab separated list of entities' lemma and type:");
            AnalyzeDocument data = entities.getData();
            data.getEntities().stream().forEach(c -> System.out.println(c.getLemma() + "\t" + c.getType()));
        }
        catch(Exception ex) {
            ex.printStackTrace();
        }
    }
}

The following curl command posts a document to the named entity recognition resource of the API's REST interface.
Run the command from a shell after replacing token with the actual authorization token.

curl -X POST https://nlapi.expert.ai/v2/analyze/standard/en/entities \
    -H 'Authorization: Bearer token' \
    -H 'Content-Type: application/json; charset=utf-8' \
    -d '{
  "document": {
    "text": "Michael Jordan was one of the best basketball players of all time. Scoring was Jordan'\''s stand-out skill, but he still holds a defensive NBA record, with eight steals in a half."
  }
}'

The server returns a JSON object.

The following curl command posts the same document to the named entity recognition resource using Windows command prompt quoting.
Open a command prompt in the folder where you installed curl and run the command after replacing token with the actual authorization token.

curl -X POST https://nlapi.expert.ai/v2/analyze/standard/en/entities  -H "Authorization: Bearer token" -H "Content-Type: application/json; charset=utf-8" -d "{\"document\": {\"text\": \"Michael Jordan was one of the best basketball players of all time. Scoring was Jordan's stand-out skill, but he still holds a defensive NBA record, with eight steals in a half.\"}}"

The server returns a JSON object.