Document classification output
The API resources that perform document classification return a JSON object in this format:
{
    "success": Boolean success flag,
    "data": {
        "content": analyzed text,
        "language": language code,
        "version": technology version info,
        "categories": []
    }
}
For the description of the content, language and version properties, see the API resources output overview.
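As a minimal sketch of reading this envelope in Python, assuming the response body is already available as a JSON string: the function name extract_categories and the sample payload are hypothetical, and raising an error when success is false is an assumption, since this section does not describe error responses.

```python
import json


def extract_categories(response_text):
    """Return the list of category objects from a classification response.

    Raises ValueError when the "success" flag is not true. This is an
    assumption about how a caller may want to treat failures.
    """
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("classification request was not successful")
    # "data" holds the analyzed text, language, version and categories.
    return payload.get("data", {}).get("categories", [])


# Hypothetical sample payload shaped like the format described above.
sample = json.dumps({
    "success": True,
    "data": {
        "content": "Basketball is a team sport.",
        "language": "en",
        "version": "sample-version",
        "categories": [{"id": "20000851", "label": "Basketball"}],
    },
})

labels = [category["label"] for category in extract_categories(sample)]
print(labels)
```

Checking the success flag before touching data keeps a failed request from being mistaken for a document with no categories.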
Each item of the categories array represents a category, for example:
{
    "frequency": 70.62,
    "hierarchy": [
        "Sport",
        "Competition discipline",
        "Basketball"
    ],
    "id": "20000851",
    "label": "Basketball",
    "namespace": "iptc_en_1.0",
    "positions": [
        {
            "end": 14,
            "start": 0
        },
        {
            "end": 53,
            "start": 35
        },
        {
            "end": 139,
            "start": 136
        }
    ],
    "score": 4005.0,
    "winner": true
}
namespace is the name of the software package containing the reference taxonomy.
id, label and hierarchy identify the category.
score is the cumulative score that was attributed to the category.
frequency is a percentage and an alternative measure of score that's easier to interpret when results need to be filtered based on the relative score difference. For example, if category #1 has frequency 50, category #2 has frequency 40 and category #3 has frequency 10, a filtering criterion like "exclude categories with a frequency that's more than 10 percentage points lower than the highest" would reject category #3.
winner is a Boolean flag set to true if the category was considered particularly relevant.
positions is an array containing the positions of the text blocks that contributed to the category score.
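The frequency filter and the positions lookup described above can be sketched in Python as follows. The helper names filter_by_frequency and contributing_text are hypothetical, the threshold is treated as a gap of 10 percentage points from the highest frequency, and the slicing assumes that start and end are character offsets into the analyzed text with an exclusive end, which this section does not state.

```python
def filter_by_frequency(categories, max_gap=10.0):
    """Keep categories whose frequency is at most max_gap points below the top one."""
    if not categories:
        return []
    top = max(category["frequency"] for category in categories)
    return [c for c in categories if top - c["frequency"] <= max_gap]


def contributing_text(content, category):
    """Return the text blocks referenced by a category's positions.

    Assumes "start" and "end" are character offsets into the analyzed text
    and that "end" is exclusive; adjust the slice if the API counts differently.
    """
    return [content[p["start"]:p["end"]] for p in category.get("positions", [])]


# Hypothetical categories mirroring the frequencies in the example above.
categories = [
    {"label": "A", "frequency": 50.0},
    {"label": "B", "frequency": 40.0},
    {"label": "C", "frequency": 10.0},
]
kept = [c["label"] for c in filter_by_frequency(categories)]
print(kept)  # ['A', 'B']
```

Filtering on the gap from the top frequency, rather than on an absolute score, keeps the threshold comparable across documents of different lengths.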