
Taxonomies

Introduction

A taxonomy is the set of categories that a document classification resource can detect.
In the expert.ai Natural Language API, the taxonomy name is used to identify document classification resources.

Several classification resources can exist for the same taxonomy, each supporting a different language. The complete endpoint of a classification resource must thus contain both the taxonomy name and the language.

To date, the API exposes classification resources for two taxonomies.
The table below shows taxonomy names and the supported languages.

Taxonomy name English Spanish French German Italian
iptc
geotax

For more information about the categories of each taxonomy, see the article in the reference section. You can also use self-documentation resources to get the categories' tree for a given taxonomy-language pair.

iptc taxonomy

The classification resources corresponding to the iptc taxonomy classify texts in terms of IPTC Media Topics subject codes.
This type of classification is particularly suited for news.

geotax taxonomy

The classification resources corresponding to the geotax taxonomy classify texts in terms of countries' names. They detect geographic places cited in the text and infer the corresponding countries.

The same resources, when requested with a specific query-string parameter, return country information as GeoJSON data.

Self-documentation resources

taxonomies

The API provides a self-documentation resource to discover available taxonomies and their features. It has this path:

taxonomies

and must be requested with the GET method.
It returns the list of available taxonomies along with the supported languages, as in the table above.

In the reference section of this manual you will find all the information you need to get taxonomies information using the API's RESTful interface, specifically:

Even if you use the API through a client that hides the REST interface, whether it is made by you or offered by expert.ai, the last piece of information is useful as it helps understand the data returned by the API.

Here is an example of getting taxonomies information:

This example is based on the Python client you can find on GitHub.

The client gets user credentials from two environment variables:

EAI_USERNAME
EAI_PASSWORD

Set those variables with your account credentials before running the sample program below.

The program prints the list of taxonomies with the languages they support.

from expertai.nlapi.cloud.client import ExpertAiClient

client = ExpertAiClient()

output = client.taxonomies()

print("Taxonomies:")

for taxonomy in output.taxonomies:
    print(taxonomy.name)
    print("\tLanguages:")
    for language in taxonomy.languages:
        print("\t", language.code)

This example is based on the Java client you can find on GitHub.

The client gets user credentials from two environment variables:

EAI_USERNAME
EAI_PASSWORD

Set those variables with your account credentials before running the sample program below.

The program prints the JSON response.

import ai.expert.nlapi.security.Authentication;
import ai.expert.nlapi.security.Authenticator;
import ai.expert.nlapi.security.BasicAuthenticator;
import ai.expert.nlapi.security.DefaultCredentialsProvider;
import ai.expert.nlapi.v2.API;
import ai.expert.nlapi.v2.message.TaxonomiesResponse;
import ai.expert.nlapi.v2.InfoAPI;
import ai.expert.nlapi.v2.InfoAPIConfig;

public class Main {

    public static Authentication createAuthentication() throws Exception {
        DefaultCredentialsProvider credentialsProvider = new DefaultCredentialsProvider();
        Authenticator authenticator = new BasicAuthenticator(credentialsProvider);
        return new Authentication(authenticator);
    }

    public static void main(String[] args) {
        try {
            InfoAPI infoAPI = new InfoAPI(InfoAPIConfig.builder()
               .withAuthentication(createAuthentication())
               .withVersion(API.Versions.V2)
               .build());

            TaxonomiesResponse taxonomies = infoAPI.getTaxonomies();
            taxonomies.prettyPrint();
        }
        catch(Exception ex) {
            ex.printStackTrace();
        }
    }
}

The following curl command gets the taxonomies documentation resource of the API's REST interface.
Run the command from a shell after replacing token with the actual authorization token.

curl -X GET https://nlapi.expert.ai/v2/taxonomies \
    -H 'Authorization: Bearer token'

The server returns a JSON object.

The following curl command gets the taxonomies documentation resource of the API's REST interface.
Open a command prompt in the folder where you installed curl and run the command after replacing token with the actual authorization token.

curl -X GET https://nlapi.expert.ai/v2/taxonomies -H "Authorization: Bearer token"

The server returns a JSON object.
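
As a rough illustration of working with that JSON object, the sketch below turns a response body into a mapping from taxonomy name to language codes. The field names (taxonomies, name, languages, code) are an assumption: they mirror the attributes exposed by the Python client above and are not taken from the reference. The sample body is hypothetical and does not reflect the languages actually supported.

```python
import json

# Hypothetical response body; field names mirror the Python client's attributes
# and the language lists are made up for illustration only.
sample_body = """
{
  "taxonomies": [
    {"name": "iptc", "languages": [{"code": "en"}, {"code": "es"}]},
    {"name": "geotax", "languages": [{"code": "en"}]}
  ]
}
"""


def languages_by_taxonomy(body: str) -> dict:
    """Map each taxonomy name to the list of its language codes."""
    data = json.loads(body)
    return {
        taxonomy["name"]: [language["code"] for language in taxonomy.get("languages", [])]
        for taxonomy in data.get("taxonomies", [])
    }


supported = languages_by_taxonomy(sample_body)
for name, codes in supported.items():
    print(name, ", ".join(codes))
```

In a real program, the body would come from the response to the GET request shown in the curl commands.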

taxonomies child resources

The API also provides self-documentation resources that return the categories' tree of specific taxonomies. They have paths like this:

taxonomies/taxonomy/language

and must be requested with the GET method. In the path above, taxonomy and language are placeholders for the taxonomy name and the language code.
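
A minimal sketch of how the path above can be filled in, using only the Python standard library. The base URL and the Authorization header are taken from the curl commands in this article; the helper name taxonomy_request is hypothetical, and the request is only built, not sent.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://nlapi.expert.ai/v2"  # base URL as used in the curl examples


def taxonomy_request(taxonomy: str, language: str, token: str) -> Request:
    """Build (but do not send) a GET request for a taxonomy's categories' tree."""
    # Fill the two placeholders of the path taxonomies/taxonomy/language.
    url = f"{BASE_URL}/taxonomies/{quote(taxonomy, safe='')}/{quote(language, safe='')}"
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


# "token" stands for the actual authorization token.
request = taxonomy_request("geotax", "en", "token")
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send the request.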

In the reference section of this manual you will find all the information you need to get taxonomy information using the API's RESTful interface, specifically:

Even if you use the API through a client that hides the REST interface, whether it is made by you or offered by expert.ai, the last piece of information is useful as it helps understand the data returned by the API.

Here is an example of getting the categories' tree for a taxonomy-language pair:

This example is based on the Python client you can find on GitHub.

The client gets user credentials from two environment variables:

EAI_USERNAME
EAI_PASSWORD

Set those variables with your account credentials before running the sample program below.

The program prints the categories' tree of the English geotax taxonomy, indenting each category according to its depth.

from expertai.nlapi.cloud.client import ExpertAiClient

def printCategory(level, category):
    tabs = "\t" * level
    print(tabs, category.id, "(", category.label, ")")
    for nestedCategory in category.categories:
        printCategory(level + 1, nestedCategory)

client = ExpertAiClient()

taxonomy = 'geotax'
language = 'en'

output = client.taxonomy(params={'taxonomy': taxonomy, 'language': language})

print("geotax categories' tree:")

for category in output.taxonomy[0].categories:
    printCategory(1, category)

This example is based on the Java client you can find on GitHub.

The client gets user credentials from two environment variables:

EAI_USERNAME
EAI_PASSWORD

Set those variables with your account credentials before running the sample program below.

The program prints the JSON response.

import ai.expert.nlapi.security.Authentication;
import ai.expert.nlapi.security.Authenticator;
import ai.expert.nlapi.security.BasicAuthenticator;
import ai.expert.nlapi.security.DefaultCredentialsProvider;
import ai.expert.nlapi.v2.API;
import ai.expert.nlapi.v2.InfoAPI;
import ai.expert.nlapi.v2.InfoAPIConfig;
import ai.expert.nlapi.v2.message.TaxonomyResponse;

public class Main {

    public static Authentication createAuthentication() throws Exception {
        DefaultCredentialsProvider credentialsProvider = new DefaultCredentialsProvider();
        Authenticator authenticator = new BasicAuthenticator(credentialsProvider);
        return new Authentication(authenticator);
    }

    public static void main(String[] args) {
        try {
            InfoAPI infoAPI = new InfoAPI(InfoAPIConfig.builder()
               .withAuthentication(createAuthentication())
               .withVersion(API.Versions.V2)
               .build());

            TaxonomyResponse taxonomy = infoAPI.getTaxonomy("geotax", API.Languages.en);
            taxonomy.prettyPrint();
        }
        catch(Exception ex) {
            ex.printStackTrace();
        }
    }
}

The following curl command gets the resource of the API's REST interface that returns the categories' tree of the English geotax taxonomy. Run the command from a shell after replacing token with the actual authorization token.

curl -X GET https://nlapi.expert.ai/v2/taxonomies/geotax/en \
    -H 'Authorization: Bearer token'

The server returns a JSON object.

The following curl command gets the resource of the API's REST interface that returns the categories' tree of the English geotax taxonomy. Open a command prompt in the folder where you installed curl and run the command after replacing token with the actual authorization token.

curl -X GET https://nlapi.expert.ai/v2/taxonomies/geotax/en -H "Authorization: Bearer token"

The server returns a JSON object.
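
The returned tree can be walked recursively, much like the Python client example does. The sketch below flattens a response body into (level, id, label) rows. The field names (taxonomy, categories, id, label) are an assumption based on the attributes used in the Python client example, and the sample body is hypothetical and not real geotax data.

```python
import json

# Hypothetical, heavily shortened response body; identifiers and labels are made up.
sample_body = """
{
  "taxonomy": [
    {
      "categories": [
        {
          "id": "A",
          "label": "Parent",
          "categories": [
            {"id": "A1", "label": "Child", "categories": []}
          ]
        },
        {"id": "B", "label": "Leaf"}
      ]
    }
  ]
}
"""


def iter_categories(categories, level=0):
    """Yield (level, id, label) for every category, depth first."""
    for category in categories or []:
        yield level, category["id"], category["label"]
        # A leaf may omit the nested list or leave it empty.
        yield from iter_categories(category.get("categories"), level + 1)


tree = json.loads(sample_body)["taxonomy"][0]["categories"]
rows = list(iter_categories(tree))
for level, category_id, label in rows:
    print("\t" * level + f"{category_id} ({label})")
```

A generator keeps the traversal separate from the printing, so the same rows can feed a lookup table or a file export instead.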