Interpret semantic analysis outcome

Introduction

After the analysis of a test document, the combined outputs of deep linguistic analysis (also called disambiguation) and tagging are shown in the Semantic Analysis tool window.
All rules (categorization, extraction, etc.) are built on and act upon these outputs, so interpreting them for test documents is fundamental to understanding how to build and tune rules for real-life cases.

Note

The Semantic Analysis tool window shows the results of the last analysis. If the source code and/or the test file contents change, it is necessary to perform a new analysis.

Set the scope

The extent of the results displayed in the tool window is determined by the value of the Scope drop-down menu on the toolbar.
By default it is SENTENCE and the tool window shows the results for the sentence on which the cursor is located inside the test document.

When the scope is PARAGRAPH, the tool window shows the sentences that compose the paragraph, separated by a vertical bar.

To show the results for another range of text (sentence or paragraph), simply select it inside the test document.

Interpret layers

The results of the deep linguistic analysis and tagging are represented by layers of colored tiles, which, from top to bottom, correspond to output elements of increasingly finer granularity.

Layers are:

Layer Color Description
Sentences Tiles represent sentences, including final punctuation.
Clauses Tiles represent clauses, the smallest grammatical units expressing a proposition.
Phrases Tiles represent phrases (for example noun phrase, verb phrase, etc.).
Syntax Tiles represent the syntactic role—subject, action, object—of phrases.
Tags Tiles represent the output tags.
Words Tiles correspond to words, collocations or punctuation.
Atoms Tiles represent atoms: for words and punctuation, they correspond to the tiles of the layer above, while for collocations, they correspond to the words that compose the collocation.
Text elements Tiles represent the portions of text corresponding to the atoms of the layer above.

Info

The syntax layer (yellow) is hidden by default.

You can change tiles' color: find the corresponding configuration settings in Studio > Studio Settings > Tool Windows > Semantic Analysis.

Toggle layers

All but the three lowest layers can be turned on and off.
To toggle a layer, select its color on the toolbar.

Use tiles

Show the tile details and highlight text

To show the tile details and highlight the corresponding text inside the test document in the editor, simply select the tile.
Tile details are shown in a dedicated panel on the right of the tool window.

Copy tile label

To copy the tile label, right-click the tile and choose Copy Label.

Show syncon IDs

You can show syncon IDs inside text element tiles (white layer) and, for collocations, inside word tiles (tomato layer) that correspond to Knowledge Graph concepts identified by the disambiguator.

To toggle base syncon ID visibility, select Toggle Syncon IDs on the toolbar.

Syncons may have more than one identifier: to toggle the visibility of additional IDs, select Toggle Syncon Multiple IDs on the toolbar.

Note

If Toggle Syncon Multiple IDs is on, Toggle Syncon IDs is automatically turned on.

Interpret clause tiles

The labels of the tiles in the clauses layer (color: yellow green) represent these types of clauses:

Label Type
INDEPENDENT A clause containing both subject and predicate. It can be either a simple sentence or the main clause of a complex sentence that can stand alone.
SUBORDINATE Dependent clause that adds information to an independent clause, but which cannot stand alone as a sentence.
RELATIVE Subordinate clause whose arguments share a referent with a main clause element (introduced by a relative pronoun).
NON-FINITE Subordinate clause whose verb is non-finite.
PREPOSITIONAL Subordinate clause introduced by a preposition.
CAUSAL Subordinate clause that provides the reason or cause of the fact stated in the independent clause.
TEMPORAL Subordinate clause that indicates an act or state that occurs prior to, at the same time as, or subsequent to the act or state of the main clause.
COMPARATIVE Subordinate clause that follows the comparative form of an adjective or adverb.
PURPOSE Clause expressing purpose, also referred to as final clause.
SUBJECT_OBJECT Subordinate noun clause acting as subject or object of the preposition.
NON_FINITE Subordinate noun clause whose verb is non-finite.
CONDITIONAL Subordinate noun clause expressing factual implications, or hypothetical situations and their consequences (introduced by if).
CONSECUTIVE Subordinate clause which expresses the result of the action stated in the main clause or a preceding sentence.
CONCESSIVE Subordinate clause which expresses an opposite idea compared to the main clause.
ADVERSATIVE Corresponds to a kind of subordinate clause which expresses an event or situation that is opposite to that of the main clause (introduced by but).
PARENTHETIC Corresponds to a kind of clause, often explanatory or qualifying, inserted into a passage with which it is not grammatically connected, and marked off by brackets, dashes, etc.
Blank tile Denotes text portions that are not real clauses.

Example

Considering the sentence:

If you are a whiskey lover, you will know that it is a spirit produced from fermented grain and aged in the wood.

tiles in the clause layer will be labeled like this:

  • If you are a whiskey lover = CONDITIONAL
  • you will know = INDEPENDENT
  • that it is a spirit = SUBJECT_OBJECT
  • produced from fermented grain = RELATIVE
  • and aged in the wood = RELATIVE

Clause tile details are:

  • Clause text
  • Clause ID, that is the clause position within the detected clause sequence
  • Clause type
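
As an informal illustration, the clause tiles of the example sentence can be modeled as plain records holding the three details listed above. The `ClauseTile` class, its field names and the `clauses_of_type` helper are hypothetical and not part of Studio; numbering clause IDs from 0 is also an assumption, since this page does not state the starting value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClauseTile:
    clause_id: int    # position in the detected clause sequence (0-based by assumption)
    text: str         # clause text
    clause_type: str  # label shown on the tile


# Clause texts and labels copied from the example above.
EXAMPLE = [
    ("If you are a whiskey lover", "CONDITIONAL"),
    ("you will know", "INDEPENDENT"),
    ("that it is a spirit", "SUBJECT_OBJECT"),
    ("produced from fermented grain", "RELATIVE"),
    ("and aged in the wood", "RELATIVE"),
]

tiles = [ClauseTile(i, text, ctype) for i, (text, ctype) in enumerate(EXAMPLE)]


def clauses_of_type(tiles, clause_type):
    """Return the texts of all clause tiles carrying the given label."""
    return [t.text for t in tiles if t.clause_type == clause_type]
```

For instance, `clauses_of_type(tiles, "RELATIVE")` returns the two relative clauses of the sentence.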

Interpret phrase tiles

The labels of the tiles in the phrase layer (color: turquoise) represent these types of phrases:

Label Type
AP Adjective phrase
CP Conjunction phrase
CR Blank lines
DP Adverb phrase
NP Noun phrase
PN Nominal predicate
PP Preposition phrase
RP Relative phrase
VP Verb phrase

Note

Tiles with no label indicate punctuation.

Example

Considering again the example sentence used above to illustrate clauses, tiles in the phrases layer will be labeled as follows:

  • If = CP
  • you = NP
  • are a whiskey lover = PN
  • , = Blank tile
  • you = NP
  • will know = VP
  • that = CP
  • it = NP
  • is a spirit = PN
  • produced = VP
  • from fermented grain = PP
  • and = CP
  • aged = VP
  • in the wood = PP
  • . = Blank tile

Phrase tile details are:

  • Phrase text.
  • Phrase type.
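
The phrase labels and the example above can be sketched as a small lookup table plus a list of (text, label) pairs. This is only an illustration: the names `PHRASE_TYPES`, `PHRASE_TILES`, `describe` and `rebuild_sentence` are hypothetical, and representing an unlabeled punctuation tile with an empty string is an assumption made for the sketch.

```python
# Phrase labels from the table above.
PHRASE_TYPES = {
    "AP": "Adjective phrase",
    "CP": "Conjunction phrase",
    "CR": "Blank lines",
    "DP": "Adverb phrase",
    "NP": "Noun phrase",
    "PN": "Nominal predicate",
    "PP": "Preposition phrase",
    "RP": "Relative phrase",
    "VP": "Verb phrase",
}

# Phrase tiles of the example sentence; "" stands for a tile with no label.
PHRASE_TILES = [
    ("If", "CP"), ("you", "NP"), ("are a whiskey lover", "PN"), (",", ""),
    ("you", "NP"), ("will know", "VP"), ("that", "CP"), ("it", "NP"),
    ("is a spirit", "PN"), ("produced", "VP"), ("from fermented grain", "PP"),
    ("and", "CP"), ("aged", "VP"), ("in the wood", "PP"), (".", ""),
]


def describe(label):
    """Expand a phrase label; tiles with no label indicate punctuation."""
    return PHRASE_TYPES.get(label, "Unknown") if label else "Punctuation"


def rebuild_sentence(tiles):
    """Join tile texts, attaching punctuation without a leading space."""
    sentence = ""
    for text, label in tiles:
        glue = "" if (not sentence or not label) else " "
        sentence += glue + text
    return sentence
```

Joining the tiles with `rebuild_sentence(PHRASE_TILES)` gives back the example sentence, which shows that the phrase tiles cover the text without gaps or overlaps.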

Interpret word tiles

The tiles in the words layer (color: tomato) correspond to the largest possible spans of text having a distinct meaning.

A tile can correspond to a single word like consciousness (including hyphenated words like high-density), a multi-word collocation like Artificial Intelligence, or punctuation.

Some tiles directly correspond to Knowledge Graph syncons (registered concepts), while others correspond to concepts that are not present in the Knowledge Graph, but are recognized as specializations of ("type of") a registered concept, their so-called virtual syncon.

Tile labels correspond to word classes (parts-of-speech).
When the underlying text is also recognized as a named entity (for example a person name, a geographic place and so on), the entity type abbreviation is appended to the word class abbreviation, separated by a dot. For example, NPR.NPH is the label for a tile corresponding to a proper name that's also a person name, like John Smith.
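
A minimal sketch of how such a label could be split into its two parts, assuming labels always follow the `WORDCLASS` or `WORDCLASS.ENTITY` pattern described above. The function name `split_tile_label` is hypothetical and not part of Studio.

```python
from typing import Optional, Tuple


def split_tile_label(label: str) -> Tuple[str, Optional[str]]:
    """Split a word tile label into (word class, entity type or None)."""
    # partition() splits at the first dot; if there is no dot,
    # the entity part comes back as an empty string.
    word_class, _, entity_type = label.partition(".")
    return word_class, entity_type or None
```

With this sketch, `split_tile_label("NPR.NPH")` returns `("NPR", "NPH")`, while a plain label such as `"VER"` returns `("VER", None)`.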

Please refer to the Studio languages reference for the list of word classes and named entities.

Tile details vary based on the word class. For example, for this sentence:

Phillipe Starck has teamed up with a longstanding collaborator, Italian furniture company Kartell, to create the first chair to be designed using artificial intelligence, which is aptly named A.I.

named is labeled as VER (verb) and its details are:

Note

In case you find parent data, it refers to concepts that, in the Knowledge Graph supernomen-subnomen or superverbum-subverbum hierarchy, depending on the word class, are "the parent" of the tile concept.

Interpret atom tiles

Tiles in the atoms layer (color: salmon) are interpreted like those of the words layer and have the same functionalities, with the difference that they are always single words, never collocations.

Interpret text elements tiles

Each tile in the text elements layer (color: white) represents the portion of text corresponding to the atom tile above it.
Tiles in this layer have the same functionalities as those in the atom layer.

Common functionalities of lowest layers tiles

Tiles in the words, atoms and text elements layers have the same functionalities.

Tile details are:

  • Item text
  • Base form (lemma)
  • Word class and morphology info like gender and number

In the details panel:

  • To copy the word class abbreviation, right-click any row and choose Copy Type.
  • To copy the base form, right-click any row and choose Copy Base Form.
  • To search the Knowledge Graph using the lemma as the search criterion, right-click any row and choose Find lemma in Knowledge Graph.

Tiles corresponding to a concept

The tiles in the word, atom and text elements layers that represent a Knowledge Graph concept have more details and functionalities.

Determine the syncon ID

To display the Knowledge Graph's syncon ID, which is especially useful if Toggle Syncon IDs is off:

  • Right-click the tile. The syncon ID appears in the title of the context menu.

Or:

  • Select the tile. The syncon ID appears among the details on the right.

Copy the syncon ID

To copy the syncon ID:

  • Right-click the tile and choose Copy ID.

Or:

  • Select the tile.
  • In the details panel on the right, right-click any row and choose Copy Syncon ID.

Look up a syncon in the Knowledge Graph

To search for a syncon in the Knowledge Graph:

  • Double-click the tile.

Or:

  • Select the tile.
  • In the details panel on the right, right-click any row and choose View Syncon in Knowledge Graph.

Or:

  • Select the tile.
  • In the details panel, double-click the row with the syncon ID.

Recognize and toggle tags

The tags (color: pink) that were superimposed on the deep linguistic analysis output, either by tagging rules or by the script, are recognized by:

  • A tag tile with the tag name.
  • A red sign beside the tagged word(s) and atom(s).

The tag tile can have different icons beside it:

  • A gear in case of tagging by script.

Or:

  • A black asterisk in case of multiple tags assigned to a token.

  • A red asterisk when multiple tags are assigned to a token and at least one of them has been deleted.

    Note

    The deleted tag and its textual value—represented by TagEntry—are highlighted in the detail panel on the right of the Semantic Analysis tool window.

Tags are shown by default.

To decide if and which tags to display, choose from the Tagger drop-down menu on the toolbar.

Note

  • If a token has multiple tags, the tag with the highest level appears in the analysis area on the left.
  • If a token has several tags with the same level, a single tag appears on the token with the asterisk icon beside its name.
  • In all cases, you will see all your tags in the detail area on the right.
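
The display rules in this note can be sketched as a small selection function. This is an illustration under stated assumptions, not Studio's actual logic: the `Tag` class and `displayed_tag` function are hypothetical, the level is assumed to be a number where higher wins, and picking the first of several tied tags is an arbitrary choice, since this page does not say which one is shown.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Tag:
    name: str
    level: int  # hypothetical numeric level; a higher value wins


def displayed_tag(tags: List[Tag]) -> Tuple[str, bool]:
    """Return (name of the tag shown on the token, whether the asterisk is shown)."""
    if not tags:
        raise ValueError("the token has no tags")
    top_level = max(t.level for t in tags)
    winners = [t for t in tags if t.level == top_level]
    # Assumption: among tied tags, the first one in input order is displayed.
    return winners[0].name, len(winners) > 1
```

In both cases the full list of tags stays available, mirroring the detail area on the right.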

Tile details in the right panel are:

  • Text form
  • Base form
  • Grammatical information
  • Tag syncon and parent syncons
  • Tag name

To copy a tag name to the clipboard, in the details panel, right-click and select TAG.

Example

If two different syncons are forced on the same token by two different tags defined in this way:

TAGS
{
    @tagLabel_1:tagSynconID_1,
    @tagLabel_2:tagSynconID_2
}

and two different tagging rules, both syncons will appear and the token will be linked to each one of them in the detail panel on the right.

Consider this example:

You can see:

  • (1) The syncon recognized by the disambiguator.
  • (2) The syncon you chose for the first tag.
  • (3) The syncon you chose for the second tag.

For each tag, there is the corresponding value represented by TagEntry. See the languages reference for more information about TagEntry.

Even when a sentence shows several tag tiles in the Semantic Analysis tool window, those tiles may all belong to a single tag rather than to multiple tags. For a comparison, consider this image:

In this case, each color has been individually tagged. This is confirmed by the Tagger tool window, which lists seven different tags, each with a value corresponding to a TagEntry available in the detail panel on the right of the Semantic Analysis tool window.

In the Tag Details panel, the Begin and End columns give the character extent of each tag.

In this other example, where the tagging results from composition used in tagging rules:

even though there are several tiles, there is only a single tag in the Tagger tool window, whose character extent spans from product to developers, corresponding to the range of characters from 4 to 22.

Note

The tag value corresponds to the TagEntry available in the right panel of the Semantic Analysis tool window.