Developer how-to

Programmatic use

The Natural Language API is a cloud-based service with a REST interface. To use it, a program must be able to access the Web and exchange HTTP messages with the API server.
Whenever the program has to analyze a document, classify a document or detect information inside a document, it must request the most suitable API resource, much as a Web browser requests a page of a site.

If you use one of the client packages available on GitHub, the details of the conversation are hidden; otherwise the program must use an HTTP client to request the API resources. In both cases the same thing happens: the program, via the HTTP client (explicit or hidden), transmits a request to the API server.
The request contains the address of the resource of interest (its URL, or endpoint) and the text of the document. This type of request uses the POST method.

Faced with this request, the server responds synchronously (after an amount of time depending on the type of processing requested and the complexity/length of the text) with the results of the processing.

If the program integrates a client package, interpreting the result is simple because it only involves examining the properties of an object. Otherwise, the result is a JSON object that the program must deserialize before it can read the values.

In this manual you will find all the information about the format of the request and the JSON objects that the API resources return, so you can easily create your own parser.

Authentication and authorization

Each API request must contain an authorization token. The bearer authentication mechanism is used, so the token must be obtained with an authentication operation and then specified as a header in each request.

The authentication operation is carried out by requesting—with a conversation identical to that described above—a special resource that is not strictly part of the API because it is shared by all the cloud services.

Its address is:

This resource must also be requested with the POST method and the body of the request must be a JSON object like this:

{
  "username": "yourusername",
  "password": "yourpassword"
}

with yourusername and yourpassword replaced by the developer credentials obtained by registering on the developer portal.

The Content-Type header of the request must be set to:

application/json; charset=utf-8

The response body is the token itself, returned as plain text like this:

eyJraWQiOiJlZXEzSnB5 ... CqJmhj2sLA
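
The token request described above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the function names build_token_request and fetch_token are hypothetical, and the address of the authentication resource is passed in as token_url because it is not reproduced here.

```python
import json
import urllib.request


def build_token_request(token_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for an authorization token.

    token_url is a placeholder: supply the address of the authentication resource.
    """
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def fetch_token(token_url: str, username: str, password: str) -> str:
    """Send the request and return the token, which arrives as plain text."""
    request = build_token_request(token_url, username, password)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8").strip()
```

Keeping the request construction separate from the network call makes it possible to inspect the method, headers and body without contacting the server.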

The application program must therefore "know" the credentials of the developer and obtain the authorization token through them.
If you use a client package, the details of the authentication and authorization dialogue are hidden. Otherwise, when the program requests API resources, it must include the Authorization header in each request using this format:

Bearer token

with token replaced by the actual token.
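
As an illustration of the header format, the following Python sketch builds an authorized POST request and deserializes a JSON response. It rests on assumptions: build_api_request and decode_response are hypothetical helper names, resource_url stands for whichever API resource you need, and payload is whatever request body that resource expects (the exact format is described elsewhere in this manual, so it is not guessed here).

```python
import json
import urllib.request
from typing import Any, Dict


def build_api_request(resource_url: str, token: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the bearer token.

    resource_url and payload are placeholders supplied by the caller.
    """
    return urllib.request.Request(
        resource_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The header value is the word "Bearer", a space, then the token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


def decode_response(raw: bytes) -> Dict[str, Any]:
    """Deserialize the JSON object returned by the server."""
    return json.loads(raw.decode("utf-8"))
```

A caller would pass the request to urllib.request.urlopen and hand the bytes it reads to decode_response.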

Authorization tokens have a limited duration: they expire after a certain time.
If the application continues to make requests with an expired token, it will get 401 Unauthorized errors. In that case it must request a new token to replace the old one.
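
One way to handle expiry is to cache the token and request a new one only after a 401 response. The sketch below is an assumption about how an application might organize this, not a prescribed mechanism: TokenCache and call_with_refresh are hypothetical names, fetch_token is any function that returns a fresh token, and send is any function that performs a request with a given token and returns the HTTP status code and the response body.

```python
from typing import Callable, Optional, Tuple


class TokenCache:
    """Hold the current token and fetch a new one on demand."""

    def __init__(self, fetch_token: Callable[[], str]) -> None:
        self._fetch_token = fetch_token
        self._token: Optional[str] = None

    def get(self) -> str:
        # Authenticate lazily: only when no valid token is cached.
        if self._token is None:
            self._token = self._fetch_token()
        return self._token

    def invalidate(self) -> None:
        # Forget the cached token so the next get() authenticates again.
        self._token = None


def call_with_refresh(cache: TokenCache, send: Callable[[str], Tuple[int, str]]) -> Tuple[int, str]:
    """Send a request; on 401, replace the token and retry exactly once."""
    status, body = send(cache.get())
    if status == 401:
        cache.invalidate()
        status, body = send(cache.get())
    return status, body
```

Retrying only once avoids an endless loop when the 401 is caused by wrong credentials rather than by an expired token.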

OpenAPI specification

The Natural Language API is described by documents conforming to the OpenAPI specification.
These documents are human-readable, but they are also meant to be interpreted by a machine, which makes it possible to generate client code automatically with suitable tools.
The OpenAPI documents can therefore be considered contracts between the API supplier and the developer community: a client developed upon the OpenAPI specification "signs" the contract and is certain to be compatible with the API.

The main contract of the current version of the API is the nlapi.yaml file published in a GitHub repository.

The main contract covers all the generic use cases, while sub-contracts can exist that describe special uses of the API. The fact that they are described in separate contracts does not mean those parts are "outside" the API, but only that the main contract is not meant to be burdened with the description of special use cases.

For example, the geotax-w-geojson.yaml OpenAPI document describes a special use of the geotax classification resource, that is, how to use the resource to obtain output in GeoJSON format and how that output is structured. Similarly, the pii.yaml OpenAPI document formally describes the specific output of the pii detector.

Request size limit

The maximum size of requests you can submit to the API is 10 KB. If your text is larger, you can break it into chunks, analyze each chunk, and then merge the results. When doing this, make each chunk as large as possible, i.e. as close to the size limit as possible, and start and end each chunk with whole sentences.
To merge results that have a score, such as categories, topics, key phrases, etc., you can create a list where each output label appears only once and its score is the sum of the scores it obtained across all chunks.
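
The chunk-and-merge approach can be sketched in Python as follows. This is an illustration under stated assumptions: split_into_chunks and merge_scores are hypothetical helpers, sentences are detected with a naive rule (terminal punctuation followed by whitespace), and each chunk's results are assumed to have already been reduced to a mapping from label to score. Because the limit applies to the request, max_bytes should be set somewhat below 10 KB to leave room for the rest of the request body.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List


def split_into_chunks(text: str, max_bytes: int) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_bytes (UTF-8).

    A single sentence longer than max_bytes is kept as its own oversized
    chunk; the caller must decide how to handle that case.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = sentence if not current else current + " " + sentence
        if current and len(candidate.encode("utf-8")) > max_bytes:
            # The next sentence does not fit: close the chunk and start a new one.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def merge_scores(per_chunk: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Sum the scores of each label across all chunks."""
    totals: Dict[str, float] = defaultdict(float)
    for scores in per_chunk:
        for label, score in scores.items():
            totals[label] += score
    return dict(totals)
```

Sizes are measured in encoded bytes rather than characters, since non-ASCII text takes more than one byte per character in UTF-8.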