Results

Successful end-of-task response

When a recognition task has finished without errors, the status resource returns a JSON object like this:

{
    "current": 100,
    "message": "completed",
    "result": {
        "header": {
            "conversionDateTime": "task end date and time",
            "customInfo": {
                "property 1 name": "property value",
                "property 2 name": "property value",
                ...
                "property n name": "property value"
            },
            "documentName": "document name",
            "errorPages": number of pages that were not analyzed,
            "totPages": total number of pages,
            "version": "engine version",
            "metadata": [
                metadata object 1,
                metadata object 2,
                ...
                metadata object n
            ]
        },
        "layout": [
            layout object 1,
            layout object 2,
            ...
            layout object n
        ],
        "words": [
            "page 1 words",
            "page 2 words",
            ...
            "page n words"
        ]
    },
    "state": "SUCCESS"
}

current is the percentage completion of the task.

message is the message indicating the phase of the task.

result is the object that contains the results, detailed below.

state is the status of the task.
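As a minimal sketch of how a client might consume this response, the snippet below checks state before reading result. The function name extract_result and the sample body are illustrative assumptions. Only the SUCCESS state is documented in this section, so any other value is simply treated as "not finished successfully".

```python
import json


def extract_result(status_body: str) -> dict:
    """Return the `result` object of a status response, or raise if the
    task has not reached the SUCCESS state (hypothetical helper)."""
    status = json.loads(status_body)
    if status.get("state") != "SUCCESS":
        raise RuntimeError(
            f"task not successful: state={status.get('state')!r}, "
            f"current={status.get('current')}, "
            f"message={status.get('message')!r}"
        )
    return status["result"]


# Hand-written sample body that follows the structure shown above.
sample = json.dumps({
    "current": 100,
    "message": "completed",
    "result": {
        "header": {"totPages": 3, "errorPages": 0},
        "layout": [],
        "words": [],
    },
    "state": "SUCCESS",
})

result = extract_result(sample)
print(result["header"]["totPages"], result["header"]["errorPages"])
```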

The header object contains information about the whole document.

conversionDateTime
conversionDateTime is the date and time the detection task ended.

customInfo
The properties of the customInfo object correspond to the properties of the PDF document. The most common properties are:

  • Author: author
  • CreationDate: creation date and time (see note 1)
  • Creator: creator
  • ModDate: last modification date and time (see note 1)
  • Producer: generator application

documentName
documentName is the document name.

errorPages
errorPages is the number of pages that could not be analyzed.

totPages
totPages is the total number of pages.

version
version is the version of the software module that performed the detection task.

metadata
metadata is an array of PDF metadata, for example:

"metadata": [{
        "bbox": [146, 207, 419, 228],
        "key": "txtPolicyNumber",
        "page": 3,
        "value": "PACUIC001101-07 "
    }, {
        "bbox": [39, 426, 417, 357],
        "key": "txtNamedInsuredAndAddress",
        "page": 3,
        "value": "SWEET FRUIT ASSOCIATION INC.\r\n7100 APRICOT WAY\r\nST. PETERSBURG, FL  33706 "
   }, {
        "bbox": [421, 356, 829, 438],
        "key": "AgencyNameAndAddress",
        "page": 3,
        "value": "StaySafe Insurance Services, Inc.\r\n2502 N Rodeo Drive\r\nTampa, FL  33607 "
    }, {
        "bbox": [144, 254, 283, 275],
        "key": "txtEffectiveDate",
        "page": 3,
        "value": "4/27/2022 "
    }
]

Metadata is optional data that the PDF editor can insert into pages. This data is not displayed on the page but is associated with visible elements.

Each metadata object can have these properties:

  • bbox: array containing the coordinates of the metadata bounding box.

    • item 0: upper left corner X
    • item 1: upper left corner Y
    • item 2: lower right corner X
    • item 3: lower right corner Y

    Coordinates are in pixels, relative to a 100 DPI (dots per inch) rendering of the page. The origin is the top left corner of the rendered page.

  • key: name of metadata key

  • page: page number where the metadata is located
  • value: metadata value
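Because bounding boxes are expressed in pixels at 100 DPI, they usually need rescaling before they can be compared with other units. The sketch below converts a bbox to inches and to PDF points (1/72 inch), and indexes metadata objects by key. The helper names are illustrative assumptions. The Y axis is left pointing downward, as in the response; flipping it to a bottom-left origin would require the page height, which this sketch does not attempt.

```python
RENDER_DPI = 100      # rendering resolution stated in the text above
POINTS_PER_INCH = 72  # size of a PDF point


def bbox_to_points(bbox):
    """Scale [x0, y0, x1, y1] from 100 DPI pixels to PDF points.
    The origin stays at the top left corner."""
    return [v * POINTS_PER_INCH / RENDER_DPI for v in bbox]


def bbox_size_inches(bbox):
    """Return (width, height) of the box in inches."""
    x0, y0, x1, y1 = bbox
    return ((x1 - x0) / RENDER_DPI, (y1 - y0) / RENDER_DPI)


def metadata_by_key(metadata):
    """Index metadata objects by their `key` property.
    Assumes keys are unique within the document."""
    return {item["key"]: item for item in metadata}


# Two entries taken from the example above.
metadata = [
    {"bbox": [146, 207, 419, 228], "key": "txtPolicyNumber",
     "page": 3, "value": "PACUIC001101-07 "},
    {"bbox": [144, 254, 283, 275], "key": "txtEffectiveDate",
     "page": 3, "value": "4/27/2022 "},
]

policy = metadata_by_key(metadata)["txtPolicyNumber"]
print(policy["value"].strip(), bbox_size_inches(policy["bbox"]))
```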

layout

layout is an array containing all the layout elements recognized in the document.
The order of the elements inside the array reflects the sequence of pages, so all the elements of page 1 are found first, then those of page 2, and so on.
Within the elements of a page, the first element represents the page itself and the other elements are blocks of text, tables or table cells. The position of text blocks and tables in the array corresponds to what Extract Beta assumed to be the order in which a human would read them on the page.
Each item in the array is an object with these properties:

  • id: block ID. Every block of text has a unique ID, which can be referenced in the children or parent properties of other blocks.
  • page: page number
  • children: list of child blocks. This property is an array, each item of which is the ID of an element that is hierarchically a child of this element. For example, the titles in a page are children of the page element, the cells of a table are the children of a table element.
  • type: element type, can be page, title, text, header, footer, table or cell.
  • parent: parent element ID. For table cells (type set to cell), the value of this property is the ID of the table element, while for title, text, header and footer blocks it is the ID of the page element. Page elements don't have this property.
  • label: element label. This is an experimental feature and must be ignored.
  • content: block text, this property is absent in page and table elements.
  • bbox: array containing the coordinates of the element's bounding box.

    • item 0: upper left corner X
    • item 1: upper left corner Y
    • item 2: lower right corner X
    • item 3: lower right corner Y

    Coordinates are in pixels, relative to a 100 DPI (dots per inch) rendering of the page. The origin is the top left corner of the rendered page.

  • Only for cell elements (type set to cell):

    • row: cell row number.
    • column: cell column number.
    • isHead: set to true if the cell is a column header.
    • span: cell span. It's an array of integers. When present, the cell spans more than one row and/or column. The first item of the array is the row span, the second is the column span.
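The parent/children relationships described above can be followed by first indexing the layout array by id. The sketch below collects the cells of one table into a dictionary keyed by (row, column) and lists the header cells. The sample layout, its IDs and its values are invented for illustration, and whether row and column numbering starts at 0 or 1 is not stated here, so the code makes no assumption about it.

```python
def index_by_id(layout):
    """Map each layout element's id to the element itself."""
    return {element["id"]: element for element in layout}


def table_grid(layout, table_id):
    """Return {(row, column): content} for the cells of one table."""
    by_id = index_by_id(layout)
    grid = {}
    for child_id in by_id[table_id].get("children", []):
        cell = by_id[child_id]
        if cell.get("type") == "cell":
            grid[(cell["row"], cell["column"])] = cell.get("content", "")
    return grid


def column_headers(layout, table_id):
    """Return the content of the cells flagged with isHead, by column."""
    by_id = index_by_id(layout)
    cells = [by_id[c] for c in by_id[table_id].get("children", [])]
    heads = [c for c in cells if c.get("isHead")]
    return [c.get("content", "") for c in sorted(heads, key=lambda c: c["column"])]


# Invented sample: one page holding a title and a 2x2 table.
layout = [
    {"id": 1, "page": 1, "type": "page", "children": [2, 3], "bbox": [0, 0, 850, 1100]},
    {"id": 2, "page": 1, "type": "title", "parent": 1, "content": "Invoice", "bbox": [50, 40, 300, 70]},
    {"id": 3, "page": 1, "type": "table", "children": [4, 5, 6, 7], "bbox": [50, 100, 500, 160]},
    {"id": 4, "page": 1, "type": "cell", "parent": 3, "row": 0, "column": 0, "isHead": True, "content": "Item", "bbox": [50, 100, 275, 130]},
    {"id": 5, "page": 1, "type": "cell", "parent": 3, "row": 0, "column": 1, "isHead": True, "content": "Qty", "bbox": [275, 100, 500, 130]},
    {"id": 6, "page": 1, "type": "cell", "parent": 3, "row": 1, "column": 0, "content": "Apricots", "bbox": [50, 130, 275, 160]},
    {"id": 7, "page": 1, "type": "cell", "parent": 3, "row": 1, "column": 1, "content": "12", "bbox": [275, 130, 500, 160]},
]

print(column_headers(layout, 3))
print(table_grid(layout, 3)[(1, 0)])
```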

words

The words array contains one item per page and each item represents, in an encoded and compressed form, all the words present on the page.

Each item is a Base64-encoded string.
The decoded value is a byte array in gzip format. Decompressing it yields another byte array in which each word corresponds to a variable-length sequence of bytes with this structure:

UTF-8 encoded text | 0x00 | Parent element ID | Bounding box coordinates

Parent element ID is four bytes long and must be interpreted as a little-endian integer. The value is the ID of the layout element in which the word is located.
Bounding box coordinates is 16 bytes long and consists of four parts of four bytes each. Each part must be interpreted as a little-endian integer. The parts are the coordinates of the word bounding box and, taken from left to right, have this meaning:

1. upper left corner X
2. upper left corner Y
3. lower right corner X
4. lower right corner Y
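The decoding steps above (Base64, then gzip, then the per-word record) can be sketched as follows. This is an illustration under assumptions, not a reference decoder: the integers are read as signed 32-bit values, and the records are assumed to follow one another with no header or padding, neither of which is stated above. The encode_for_demo helper exists only to build a test payload.

```python
import base64
import gzip
import struct

RECORD_TAIL = struct.Struct("<5i")  # parent ID + 4 coordinates, little-endian


def decode_page_words(encoded: str):
    """Decode one item of the `words` array into a list of word records."""
    raw = gzip.decompress(base64.b64decode(encoded))
    words = []
    pos = 0
    while pos < len(raw):
        end = raw.index(b"\x00", pos)  # 0x00 terminates the UTF-8 text
        text = raw[pos:end].decode("utf-8")
        parent, x0, y0, x1, y1 = RECORD_TAIL.unpack_from(raw, end + 1)
        words.append({"text": text, "parent": parent, "bbox": [x0, y0, x1, y1]})
        pos = end + 1 + RECORD_TAIL.size  # skip terminator and the 20-byte tail
    return words


def encode_for_demo(words) -> str:
    """Build a payload in the same format, for demonstration only."""
    payload = b"".join(
        w["text"].encode("utf-8") + b"\x00" + RECORD_TAIL.pack(w["parent"], *w["bbox"])
        for w in words
    )
    return base64.b64encode(gzip.compress(payload)).decode("ascii")


demo_words = [
    {"text": "SWEET", "parent": 7, "bbox": [39, 357, 90, 372]},
    {"text": "caffè", "parent": 7, "bbox": [95, 357, 140, 372]},
]
decoded = decode_page_words(encode_for_demo(demo_words))
for word in decoded:
    print(word["text"], word["parent"], word["bbox"])
```

The parent value of each decoded word can then be looked up among the id values of the layout array to find the block that contains the word.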

  1. PDF defines a standard date format similar to the international standard Abstract Syntax Notation One (ASN.1), defined in ISO/IEC 8824. A date-time is a string with this format:

    D:YYYYMMDDHHmmSSOHH'mm'
    

    where:

    • YYYY is the year
    • MM is the month
    • DD is the day (01-31)
    • HH is the hour (00-23)
    • mm is the minute (00-59)
    • SS is the second (00-59)
    • O is the relationship of local time to Universal Time (UT), denoted by one of the characters +, -, or Z (see below)
    • HH followed by ' is the absolute value of the offset from UT in hours (00-23)
    • mm followed by ' is the absolute value of the offset from UT in minutes (00-59)

    A plus sign (+) as the value of the O field signifies that local time is later than UT, a minus sign (-) that local time is earlier than UT, and the letter Z that local time is equal to UT. If no UT information is specified, the relationship of the specified time to UT is considered to be unknown. Whether or not the time zone is known, the rest of the date is specified in local time.
    For example, December 23, 2022, at 7:52 PM, U.S. Pacific Standard Time, is represented by the string:

    D:20221223195200-08'00'
    

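    Similarly, March 27, 2022, at 7:52:30 PM, in a time zone five hours ahead of UT, is represented by the string: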

    D:20220327195230+05'00'
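A date string in the format described in note 1 can be turned into a Python datetime with a small parser. The sketch below handles only the full form shown in the note, with an optional Z or signed offset; the function name parse_pdf_date is an illustrative assumption, and strings that abbreviate or omit trailing fields are rejected rather than guessed at.

```python
import re
from datetime import datetime, timedelta, timezone

PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"  # YYYYMMDDHHmmSS
    r"(?:(Z)|([+-])(\d{2})'(\d{2})')?$"               # Z, or O HH' mm'
)


def parse_pdf_date(value: str) -> datetime:
    """Parse D:YYYYMMDDHHmmSSOHH'mm' into a datetime.
    The result is naive when no UT relationship is given."""
    match = PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported PDF date string: {value!r}")
    groups = match.groups()
    year, month, day, hour, minute, second = (int(g) for g in groups[:6])
    utc_flag, sign, off_hours, off_minutes = groups[6:]
    if utc_flag:
        tz = timezone.utc
    elif sign:
        offset = timedelta(hours=int(off_hours), minutes=int(off_minutes))
        tz = timezone(-offset if sign == "-" else offset)
    else:
        tz = None  # relationship to UT unknown
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)


print(parse_pdf_date("D:20221223195200-08'00'").isoformat())
print(parse_pdf_date("D:20220327195230+05'00'").isoformat())
```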