


The Extract Beta REST API exposes:

  • A resource to start an asynchronous recognition task.
  • A resource for each running recognition task, through which you can check the status of the task or, once the task has finished, obtain its results.

The entire API is formally described in the OpenAPI specification.

The resource to get the status of a recognition task or its final outcome is:

where taskID is the identification code of the recognition task previously started with layout-document-async.

It must be requested with the GET verb.
The Authorization header must be set as described in the developer's how-to.
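As a rough illustration of the two requirements above, the sketch below prepares (without sending) a GET request that carries an Authorization header. The `status_url` and `token` parameters are hypothetical placeholders, and the Bearer scheme is an assumption; the developer's how-to remains the reference for the actual header format.

```python
import urllib.request


def build_status_request(status_url: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for a task's status resource.

    status_url: full URL of the status resource, including the task ID
                (hypothetical parameter, supplied by the caller).
    token:      authorization token obtained as described in the how-to.
    """
    # Assumption: the token is sent with the "Bearer" scheme.
    # Adjust this if the developer's how-to prescribes a different format.
    return urllib.request.Request(
        status_url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the actual call.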

Task in progress response

The response is a UTF-8 encoded JSON object.
While the layout recognition task is queued or running, the object has this structure:

    {
        "current": percentage of work done,
        "message": "task phase",
        "state": "task status"
    }

task status can be PENDING, if the task has been queued but not yet started, or PROGRESS if the task has started.
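A client typically polls the resource until the state is no longer PENDING or PROGRESS. The following is a minimal sketch of such a loop, not an official client: `fetch_status` is a hypothetical callable that performs the GET request and returns the decoded JSON object, and the polling interval and attempt limit are arbitrary choices.

```python
import time
from typing import Callable

# States that mean the task has not finished yet.
RUNNING_STATES = {"PENDING", "PROGRESS"}


def wait_for_task(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    max_polls: int = 100,
) -> dict:
    """Poll until the task leaves the PENDING/PROGRESS states.

    Returns the first response whose state is not a running state.
    Raises TimeoutError if the task is still running after max_polls polls.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("state") not in RUNNING_STATES:
            return status
        # Optional progress report based on the "current" percentage.
        print(f"{status.get('state')}: {status.get('current')}% done")
        time.sleep(interval)
    raise TimeoutError("task still running after the maximum number of polls")
```

Injecting `fetch_status` keeps the loop independent of the HTTP library and makes it easy to exercise with canned responses.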

Successful end-of-task response

When the task has finished without errors, the object has this structure:

    {
        "current": 100,
        "message": "completed",
        "result": {
            "header": {
                "conversionDateTime": "task end date and time",
                "customInfo": {
                    "property 1 name": "property value",
                    "property 2 name": "property value",
                    "property n name": "property value"
                },
                "documentName": "document name",
                "errorPages": number of pages that were not analyzed,
                "totPages": number of pages,
                "version": "engine version",
                "metadata": [
                    metadata object 1,
                    metadata object 2,
                    metadata object n
                ]
            },
            "layout": [
                layout object 1,
                layout object 2,
                layout object n
            ],
            "words": [
                "page 1 words",
                "page 2 words",
                "page n words"
            ]
        },
        "state": "SUCCESS"
    }

Find the detailed description of this response in the dedicated article.
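To show how the fields of a successful response fit together, here is a small sketch that pulls a few header values out of the decoded JSON object. The function name and the shape of the returned summary are illustrative only, and the reading of `words` as one entry per page is an assumption based on the structure shown above.

```python
def summarize_result(response: dict) -> dict:
    """Extract a short summary from a successful end-of-task response."""
    if response.get("state") != "SUCCESS":
        raise ValueError("response does not describe a successfully finished task")

    result = response["result"]
    header = result["header"]
    return {
        "document": header["documentName"],
        "total_pages": header["totPages"],
        # errorPages counts the pages that were not analyzed.
        "analyzed_pages": header["totPages"] - header["errorPages"],
        "engine_version": header["version"],
        # Assumption: "words" holds one entry per page.
        "pages_with_words": len(result["words"]),
    }
```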

Managed error response

If the task ends because of an error, the response object has this structure:

    {
        "current": 0,
        "message": "error message",
        "state": "FAILURE"
    }

The value of error message describes why the task failed.
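One possible way to surface a managed error in client code is to turn the FAILURE state into an exception that carries the error message. The `TaskFailed` class and `raise_on_failure` function below are hypothetical names used for illustration, not part of the API.

```python
class TaskFailed(Exception):
    """Raised when a recognition task reports the FAILURE state."""


def raise_on_failure(response: dict) -> dict:
    """Return the response unchanged unless its state is FAILURE."""
    if response.get("state") == "FAILURE":
        # "message" holds the description of the reason for the failure.
        raise TaskFailed(response.get("message", "no error message provided"))
    return response
```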

HTTP status codes

Here are the possible HTTP status codes that Extract Beta can return when the status resource is requested:

200 OK

The request succeeded.

401 Unauthorized

Possible reasons for this status are:

  • invalid credentials were specified when requesting the authorization token
  • the authorization token is missing
  • the authorization token is not valid
  • the authorization token has expired

403 Forbidden

This code is returned if the user does not have an active subscription.

404 Not Found

The server cannot find the requested resource because the URL is wrong.

405 Method Not Allowed

The request method is known by the server but is not supported by the target resource. For example, you requested the resource with POST instead of GET.

413 Request Entity Too Large

The request is larger than the limit defined by the plan the user subscribed to.

500 Internal Server Error

The server has encountered a situation it doesn't know how to handle.
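The status codes listed above can be mapped to short diagnostic hints in client code. The sketch below is one way to do that; the hint wording is paraphrased from this page, and the `STATUS_HINTS` table and `explain_status` function are illustrative names rather than anything the API provides.

```python
# Hints paraphrased from the status code list above; wording is illustrative.
STATUS_HINTS = {
    200: "request succeeded; inspect the JSON body for the task state",
    401: "authorization token missing, invalid, or expired, or wrong credentials",
    403: "the user does not have an active subscription",
    404: "resource not found; check the URL",
    405: "wrong HTTP method; the status resource must be requested with GET",
    413: "request larger than the limit of the subscribed plan",
    500: "the server encountered a situation it could not handle",
}


def explain_status(code: int) -> str:
    """Return a short hint for an HTTP status code returned by the status resource."""
    return STATUS_HINTS.get(code, f"unexpected HTTP status {code}")
```

A caller might log `explain_status(response_code)` whenever the code is not 200, before deciding whether to retry or give up.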