Blocks
The methods of the LAY object that work on layout blocks are:
getBlocksCount
The getBlocksCount method returns the number of blocks detected in the document layout.
The syntax is:
LAY.getBlocksCount()
The method returns an integer representing the number of blocks.
Info
The bounding box of each page, which contains all of the page's blocks, is also counted as a block.
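As a minimal sketch of how the count can drive a loop over all blocks, the helper below visits candidate IDs from 1 up to the value returned by getBlocksCount and fetches each block with getBlock. It assumes that block IDs are consecutive, which this page does not state, so IDs for which getBlock returns nothing are skipped. The names listBlockIds and stubLay are hypothetical: stubLay only stands in for the real LAY object so that the snippet runs on its own.

```javascript
// Hypothetical stand-in for the LAY object so the sketch runs on its own.
// In a real script, LAY is provided by the environment.
var stubLay = {
    blocks: [
        { id: 1, parent: -1, children: [2, 3] }, // page bounding box
        { id: 2, parent: 1, children: [] },
        { id: 3, parent: 1, children: [] }
    ],
    getBlocksCount: function () {
        return this.blocks.length;
    },
    getBlock: function (id) {
        for (var i = 0; i < this.blocks.length; i++) {
            if (this.blocks[i].id === id) {
                return this.blocks[i];
            }
        }
        return undefined; // assumed behavior for an unknown ID
    }
};

// Collects the IDs of all blocks.
// Assumption: IDs run from 1 to the block count without gaps.
function listBlockIds(lay) {
    var ids = [];
    var count = lay.getBlocksCount();
    for (var id = 1; id <= count; id++) {
        var block = lay.getBlock(id);
        if (block) {
            ids.push(block.id);
        }
    }
    return ids;
}

var allIds = listBlockIds(stubLay);
```

With the stub above, allIds holds the three IDs 1, 2 and 3, the first of which is the page bounding box.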
getBlock
Given an ID, the getBlock method returns an object that represents the layout block with that ID.
Considering the example given in the introduction, the instruction:
var block = LAY.getBlock(3);
sets the variable block to an object corresponding to the block with ID 3.
The object may look like this:
{
"id": 3,
"parent": 1,
"pageNumber": 1,
"type": "text",
"x0": 67,
"y0": 139,
"x1": 152,
"y1": 176,
"children": [],
"beginPos": 8,
"endPos": 24,
"tokenBegin": 1,
"tokenEnd": 8,
"wordBegin": 1,
"wordEnd": 2,
"label": ""
}
where:
Field name | Description | Field type | Default value
---|---|---|---
id | A unique ID associated with the block | Integer | -1
parent | The ID of the parent block | Integer | -1
pageNumber | The number of the page on which the block is located | Integer | -1
type | The type of block | String | Empty string
label | A label with some additional information on the block | String | Empty string
x0 | The x-axis coordinate of the upper-left corner of the block, relative to the page | Integer | 0
y0 | The y-axis coordinate of the upper-left corner of the block, relative to the page | Integer | 0
x1 | The x-axis coordinate of the lower-right corner of the block, relative to the page | Integer | 0
y1 | The y-axis coordinate of the lower-right corner of the block, relative to the page | Integer | 0
children | The IDs of the blocks that are children of the block (like the blocks of a page or the cells of a table) | List of Integers | Empty array
beginPos | The position in the text at which the block content starts | Integer | -1
endPos | The position in the text at which the block content ends | Integer | -1
tokenBegin | The index of the first token in the block content | Integer | -1
tokenEnd | The index of the last token in the block content | Integer | -1
wordBegin | The index of the first word in the block | Integer | -1
wordEnd | The index of the last word in the block | Integer | -1
The syntax is:
LAY.getBlock(id);
where id is the block ID.
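The fields of the returned object can be combined into small helpers. The sketch below is illustrative and not part of the LAY API: it derives the size of a block from its corner coordinates and uses the documented default value of -1 to check whether the text position fields were set. The function names blockSize and hasTextPositions are hypothetical, and the sample object reuses the values of the example block with ID 3.

```javascript
// Sample block with the same values as the example object for ID 3.
var sampleBlock = {
    id: 3,
    parent: 1,
    pageNumber: 1,
    type: "text",
    x0: 67,
    y0: 139,
    x1: 152,
    y1: 176,
    children: [],
    beginPos: 8,
    endPos: 24,
    tokenBegin: 1,
    tokenEnd: 8,
    wordBegin: 1,
    wordEnd: 2,
    label: ""
};

// Width and height from the upper-left (x0, y0) and lower-right (x1, y1) corners.
function blockSize(block) {
    return {
        width: block.x1 - block.x0,
        height: block.y1 - block.y0
    };
}

// True when beginPos and endPos differ from their default value of -1.
function hasTextPositions(block) {
    return block.beginPos !== -1 && block.endPos !== -1;
}

var size = blockSize(sampleBlock);
var positioned = hasTextPositions(sampleBlock);
```

For the sample block, size has a width of 85 and a height of 37, and positioned is true.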
getBlockText
The getBlockText method returns the text contained in the block with the given ID, or undefined if the ID is not valid.
Info
Block IDs are not zero-based: they start from 1.
For example, the instruction:
var blockText = LAY.getBlockText(3);
sets the variable blockText which, given the Extract example output described in the dedicated page, has the following value:
DATE:
10/21/2015
The syntax is:
LAY.getBlockText(id);
where id is the block ID.
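Because getBlockText returns undefined for an ID that is not valid, a caller may want to guard against that case before using the result. The following is a minimal sketch of one way to do so. The names blockTextOr and textStubLay are hypothetical, and textStubLay with its sample text only stands in for the real LAY object so that the snippet runs on its own.

```javascript
// Hypothetical stand-in for the LAY object; only block 3 has text here.
var textStubLay = {
    texts: { 3: "sample block text" },
    getBlockText: function (id) {
        return this.texts[id]; // undefined when the ID is not valid
    }
};

// Returns the text of the block, or the fallback when the ID is not valid.
function blockTextOr(lay, id, fallback) {
    var text = lay.getBlockText(id);
    return text === undefined ? fallback : text;
}

var found = blockTextOr(textStubLay, 3, "");
// ID 0 is never valid, since block IDs start from 1.
var missing = blockTextOr(textStubLay, 0, "");
```

Comparing against undefined explicitly, rather than testing for a falsy value, keeps a block whose text is an empty string distinct from an ID that is not valid.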