

The methods of the LAY object that work on layout blocks are getBlocksCount, getBlock and getBlockText.


The getBlocksCount method returns the number of blocks detected in the document layout.

The syntax is:

LAY.getBlocksCount()

The method returns an integer representing the number of blocks.


The bounding box of each page, which contains all of the page's blocks, is also counted as a block.
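
As a minimal sketch, the count can be used to walk over every block. It assumes that block IDs run contiguously from 1 up to the value returned by getBlocksCount, which is not stated explicitly here. The LAY object below is a hypothetical stand-in with made-up data (including the "page" type name) so that the snippet runs outside the real environment; in real code the stub would be dropped.

```javascript
// Hypothetical stand-in for the real LAY object, for illustration only.
var LAY = {
    _blocks: [
        { id: 1, parent: -1, type: "page", children: [2, 3] }, // "page" is an assumed type name
        { id: 2, parent: 1, type: "title", children: [] },
        { id: 3, parent: 1, type: "text", children: [] }
    ],
    getBlocksCount: function () { return this._blocks.length; },
    getBlock: function (id) { return this._blocks[id - 1]; }
};

// Count blocks per type. Assumption: IDs are contiguous, from 1 to the count.
function countBlocksByType() {
    var counts = {};
    var total = LAY.getBlocksCount();
    for (var id = 1; id <= total; id++) {
        var block = LAY.getBlock(id);
        if (!block) { continue; } // skip IDs that do not resolve to a block
        counts[block.type] = (counts[block.type] || 0) + 1;
    }
    return counts;
}

var typeCounts = countBlocksByType();
```

Because page bounding boxes are counted as blocks, a tally like this includes one entry per page in addition to the content blocks.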


Given an ID, the getBlock method returns an object that represents the layout block with that ID.

Considering the example given in the introduction, the instruction:

var block = LAY.getBlock(3);

sets the variable block to an object corresponding to the block with ID 3. The object may look like this:

{
    "id": 3,
    "parent": 1,
    "pageNumber": 1,
    "type": "text",
    "x0": 67,
    "y0": 139,
    "x1": 152,
    "y1": 176,
    "children": [],
    "beginPos": 8,
    "endPos": 24,
    "tokenBegin": 1,
    "tokenEnd": 8,
    "wordBegin": 1,
    "wordEnd": 2,
    "label": ""
}


| Field name | Description | Field type | Default value |
|---|---|---|---|
| id | A unique ID associated with the block | Integer | -1 |
| parent | The ID of the parent block | Integer | -1 |
| pageNumber | The number of the page on which the block is located | Integer | -1 |
| type | The type of block (text, title, cell, and so on) | String | Empty string |
| label | A label with additional information about the block | String | Empty string |
| x0 | The x-axis coordinate of the upper-left corner of the block, relative to the page | Integer | 0 |
| y0 | The y-axis coordinate of the upper-left corner of the block, relative to the page | Integer | 0 |
| x1 | The x-axis coordinate of the lower-right corner of the block, relative to the page | Integer | 0 |
| y1 | The y-axis coordinate of the lower-right corner of the block, relative to the page | Integer | 0 |
| children | The IDs of the blocks that are children of the block (like the blocks of a page or the cells of a table) | List of Integers | Empty array |
| beginPos | The position in the text at which the block content starts | Integer | -1 |
| endPos | The position in the text at which the block content ends | Integer | -1 |
| tokenBegin | The index of the first token in the block content | Integer | -1 |
| tokenEnd | The index of the last token in the block content | Integer | -1 |
| wordBegin | The index of the first word in the block | Integer | -1 |
| wordEnd | The index of the last word in the block | Integer | -1 |
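
The coordinate fields can be combined to derive a block's size or to test whether one block lies inside another. The sketch below works on plain objects shaped like the example block, so it needs no LAY call. The helper names blockSize and contains are illustrative, and the snippet assumes that y grows downward, consistent with the upper-left and lower-right corners described in the table.

```javascript
// Block object shaped like the example above (only the fields used here).
var block = { id: 3, pageNumber: 1, x0: 67, y0: 139, x1: 152, y1: 176 };

// Width and height from the upper-left (x0, y0) and lower-right (x1, y1) corners.
function blockSize(b) {
    return { width: b.x1 - b.x0, height: b.y1 - b.y0 };
}

// True when block `inner` lies entirely within block `outer` on the same page.
function contains(outer, inner) {
    return outer.pageNumber === inner.pageNumber &&
        outer.x0 <= inner.x0 && outer.y0 <= inner.y0 &&
        outer.x1 >= inner.x1 && outer.y1 >= inner.y1;
}

var size = blockSize(block); // { width: 85, height: 37 }
```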

The syntax is:

LAY.getBlock(id)

where id is the block ID.
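
Since each block lists the IDs of its children, getBlock can be called recursively to walk the layout tree. The following is a sketch only: the LAY object is a hypothetical stand-in with made-up blocks, and the helper descendantIds is not part of the LAY object. It also assumes that an ID with no matching block yields a falsy value, which is not confirmed here for getBlock.

```javascript
// Hypothetical stand-in for the real LAY object, for illustration only.
var LAY = {
    _blocks: {
        1: { id: 1, parent: -1, children: [2, 3] },
        2: { id: 2, parent: 1, children: [] },
        3: { id: 3, parent: 1, children: [4] },
        4: { id: 4, parent: 3, children: [] }
    },
    getBlock: function (id) { return this._blocks[id]; }
};

// Collect the IDs of all descendants of a block, depth first.
function descendantIds(id) {
    var result = [];
    var block = LAY.getBlock(id);
    if (!block) { return result; } // assumed behavior for an unknown ID
    block.children.forEach(function (childId) {
        result.push(childId);
        result = result.concat(descendantIds(childId));
    });
    return result;
}

var ids = descendantIds(1); // [2, 3, 4]
```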


The getBlockText method returns the text contained in the block with the given ID, or undefined if the ID is not valid.


Block IDs are not zero-based: they start from 1.

For example, the instruction:

var blockText = LAY.getBlockText(3);

sets the variable blockText which, given the Extract example output described in the dedicated page, has the following value:


The syntax is:

LAY.getBlockText(id)

where id is the block ID.
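
Because getBlockText returns undefined for an invalid ID, callers may want to guard the result before using it. A minimal sketch follows; the LAY object is a hypothetical stand-in with made-up text, and blockTextOr is an illustrative helper, not a LAY method.

```javascript
// Hypothetical stand-in for the real LAY object, for illustration only.
var LAY = {
    _texts: { 1: "sample page text", 2: "sample title" }, // made-up content
    getBlockText: function (id) { return this._texts[id]; }
};

// Return the block text, or a fallback when the ID is not valid.
function blockTextOr(id, fallback) {
    var text = LAY.getBlockText(id);
    return text === undefined ? fallback : text;
}

var title = blockTextOr(2, "");   // "sample title"
var missing = blockTextOr(0, ""); // "" because IDs start from 1
```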