Full document analysis output
The full analysis resource returns a JSON object with this format:
{
    "success": Boolean success flag,
    "data": {
        "content": analyzed text,
        "language": language code,
        "version": technology version info,
        "knowledge": [],
        "paragraphs": [],
        "sentences": [],
        "phrases": [],
        "tokens": [],
        "mainSentences": [],
        "mainPhrases": [],
        "mainLemmas": [],
        "mainSyncons": [],
        "topics": [],
        "entities": [],
        "relations": [],
        "sentiment": {}
    }
}
For the description of the content, language and version properties, see the API output overview.
Component arrays and objects have the same structure they have in the response of the resource that performs the corresponding analysis, so:
- For the paragraphs, sentences, phrases and tokens arrays, see the format of deep linguistic analysis output.
- For the mainSentences, mainPhrases, mainLemmas, mainSyncons and topics arrays, see the format of keyphrase extraction output.
- For the entities array, see the format of named entity recognition output.
- For the relations array, see the format of relation extraction output.
- For the sentiment object, see the format of sentiment analysis output.
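As a rough illustration of the mapping above, the following Python sketch checks which components of an already parsed response are populated, grouped by the analysis that defines their format. The function name populated_components and the sample response are hypothetical and only serve the example; the key names come from the format shown earlier.

```python
# Keys of the "data" object, grouped by the analysis that defines their format.
COMPONENT_GROUPS = {
    "deep linguistic analysis": ("paragraphs", "sentences", "phrases", "tokens"),
    "keyphrase extraction": (
        "mainSentences", "mainPhrases", "mainLemmas", "mainSyncons", "topics",
    ),
    "named entity recognition": ("entities",),
    "relation extraction": ("relations",),
    "sentiment analysis": ("sentiment",),
}


def populated_components(response):
    """Return {analysis: {key: True if that component is non-empty}}."""
    if not response.get("success"):
        raise ValueError("the response does not report success")
    data = response.get("data", {})
    return {
        analysis: {key: bool(data.get(key)) for key in keys}
        for analysis, keys in COMPONENT_GROUPS.items()
    }


# Hypothetical, heavily truncated response used only for illustration.
sample = {
    "success": True,
    "data": {"tokens": [{"lemma": "player"}], "entities": [], "sentiment": {}},
}
report = populated_components(sample)
print(report["deep linguistic analysis"])
```

Missing keys are treated the same as empty ones, which is a simplification made for this sketch.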
knowledge
The knowledge array contains Knowledge Graph data about syncons.
Items in the following arrays can have a syncon property:
- tokens
- mainSyncons
- entities
- relations
- items (in the sentiment object)

The link between such an item and the corresponding entry in the knowledge array is the value of the syncon property, which the two have in common.
For example, if this is an item of the tokens array:
{
    "atoms": [
        {
            "end": 45,
            "lemma": "basketball",
            "start": 35,
            "type": "NOU"
        },
        {
            "end": 53,
            "lemma": "player",
            "start": 46,
            "type": "NOU"
        }
    ],
    "dependency": {
        "head": 2,
        "id": 6,
        "label": "nmod"
    },
    "end": 53,
    "lemma": "basketball player",
    "morphology": "Number=Plur",
    "paragraph": 0,
    "phrase": 2,
    "pos": "NOUN",
    "sentence": 0,
    "start": 35,
    "syncon": 41583,
    "type": "NOU"
}
the corresponding entry in the knowledge array can be:
{
    "label": "person.athlete.basketball_player",
    "properties": [
        {
            "type": "WikiDataId",
            "value": "Q3665646"
        }
    ],
    "syncon": 41583
}
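The lookup between an item and its knowledge entry can be sketched in Python as follows. This is a minimal example that assumes the response has already been parsed into a dictionary; the helper names index_knowledge and knowledge_for are hypothetical, and the sample data is a trimmed copy of the token and knowledge entry shown above.

```python
def index_knowledge(data):
    """Map each syncon ID to its entry in the "knowledge" array."""
    return {entry["syncon"]: entry for entry in data.get("knowledge", [])}


def knowledge_for(item, knowledge_index):
    """Return the knowledge entry sharing the item's syncon, or None."""
    return knowledge_index.get(item.get("syncon"))


# Trimmed copy of the example token and knowledge entry.
data = {
    "tokens": [{"lemma": "basketball player", "syncon": 41583}],
    "knowledge": [
        {
            "label": "person.athlete.basketball_player",
            "properties": [{"type": "WikiDataId", "value": "Q3665646"}],
            "syncon": 41583,
        }
    ],
}

index = index_knowledge(data)
entry = knowledge_for(data["tokens"][0], index)
print(entry["label"])  # person.athlete.basketball_player
```

Building the index once avoids scanning the knowledge array for every item.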
This can be a many-to-one relationship: more than one item in arrays such as tokens, relations and sentiment items can have the same syncon ID, but the knowledge array always contains exactly one entry for a given syncon. The knowledge array therefore works as a reference table.
For example, if a text contains several occurrences of basketball player, each occurrence corresponds to a separate item in the tokens array, but all those tokens point to the same entry in the knowledge array.
Items with the syncon property set to -1 have no corresponding entry in the knowledge array, because they are concepts recognized through heuristics that are not present in the Knowledge Graph.
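The many-to-one relationship and the -1 case can be illustrated with a short Python sketch that groups tokens by syncon ID and skips the ones without a Knowledge Graph entry. The function name group_tokens_by_syncon and the sample tokens (including their positions and the lemma of the -1 token) are made up for the example.

```python
from collections import defaultdict


def group_tokens_by_syncon(tokens):
    """Group tokens by syncon ID, ignoring tokens without a knowledge entry."""
    groups = defaultdict(list)
    for token in tokens:
        syncon = token.get("syncon", -1)
        if syncon == -1:
            # Recognized through heuristics: nothing to look up in "knowledge".
            continue
        groups[syncon].append(token)
    return dict(groups)


# Hypothetical tokens: two occurrences of the same concept plus one -1 token.
tokens = [
    {"lemma": "basketball player", "start": 35, "end": 53, "syncon": 41583},
    {"lemma": "basketball player", "start": 120, "end": 138, "syncon": 41583},
    {"lemma": "Zorblax", "start": 60, "end": 67, "syncon": -1},
]

groups = group_tokens_by_syncon(tokens)
print(len(groups[41583]))  # 2
```

Both occurrences end up under the single key 41583, which is the same key that identifies the one matching entry in the knowledge array.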
Each entry in the knowledge array has a format like this:
{
    "label": "person",
    "properties": [
        {
            "type": "WikiDataId",
            "value": "Q215627"
        }
    ],
    "syncon": 73282
}
The label property is a textual rendering of the general conceptual category for the syncon in the Knowledge Graph.
The properties array contains the outcome of knowledge linking. Each item has two properties, type and value: type specifies the knowledge base and value is the property value.
Possible knowledge bases and interpretations of the value property follow.
type | value |
---|---|
Coordinate | Latitude and longitude |
WikiDataId | Wikidata item ID |
DBpediaId | URL of the DBpedia content |
GeoNamesId | ID of the record in the GeoNames database |
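To read the knowledge linking results of one entry, the properties array can be regrouped by type. The Python sketch below assumes only the type and value fields described above; the function name properties_by_type is hypothetical, and values are kept as returned because their exact formats are not specified here.

```python
def properties_by_type(entry):
    """Return {type: [values]} for one entry of the "knowledge" array."""
    grouped = {}
    for prop in entry.get("properties", []):
        # Values are collected as-is; no parsing of coordinates or URLs.
        grouped.setdefault(prop["type"], []).append(prop["value"])
    return grouped


# The example entry shown earlier in this section.
entry = {
    "label": "person",
    "properties": [{"type": "WikiDataId", "value": "Q215627"}],
    "syncon": 73282,
}

links = properties_by_type(entry)
print(links.get("WikiDataId"))  # ['Q215627']
print(links.get("GeoNamesId"))  # None
```

Using lists as values is a precaution taken in this sketch in case an entry carries more than one property of the same type.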