Document classification output
The document classification resource returns a JSON object with this format:
{
"success": Boolean success flag,
"data": {
"content": analyzed text,
"language": language code,
"version": technology version info,
"categories": []
}
}
For the description of the contents
, language
and version
properties, see output overview.
Each item of the categories
array represents a category, for example:
{
"frequency": 70.62,
"hierarchy": [
"Sport",
"Competition discipline",
"Basketball"
],
"id": "20000851",
"label": "Basketball",
"namespace": "iptc_en_1.0",
"positions": [
{
"end": 14,
"start": 0
},
{
"end": 53,
"start": 35
},
{
"end": 139,
"start": 136
}
],
"score": 4005.0,
"winner": true
}
namespace
is the name of the software module carrying out document classification inside the text intelligence engine.id
,label
andhierarchy
identify the category in the categories' tree.score
is the cumulative score that was attributed to the category.-
frequency
is the percentage ratio of the category score to the sum of all categories' scores.Info
Note that the sum of the frequencies of all categories could be less than 100. This occurs when the text intelligence engine is configured to filter out the "losers" categories. that is, those with the lowest scores.
For further information on the topic of category score, consult the Studio documentation. -
winner
is a Boolean flag set totrue
if the category was considered particularly relevant. positions
is an array containing the positions of the text blocks that contributed to category score.