Sentiment Analysis French Knowledge Model
The Sentiment Analysis French Knowledge Model (display name: Sentiment Analysis French v#) categorizes documents written in French according to the sentiment and emotions they express.
The model also extracts potentially positive or negative terms, facts and events from generic news and financial articles.
Categorization
The model taxonomy (below) includes a positive and a negative cluster (émotions positives and émotions négatives).
1000 émotions
1100 émotions négatives
1110 rage
1120 inquiétude
1130 détresse
1140 contrariété
1150 surprise
1200 émotions positives
1210 joie
1220 affection
1230 satisfaction
1240 sérénité
1250 surprise
A surprise can be positive or negative, which is why the surprise category is included in both clusters.
Extraction groups and classes
SENTIMENT_SCORE
The SENTIMENT_SCORE group includes sentiment polarity and score.
Class | Description |
---|---|
polarite | Sentiment polarity. Possible values are positive, negative and neutre. |
score | Sentiment score. |
Polarity and score are derived from categorization scores: the cumulative score of categories under the émotions négatives cluster is subtracted from the cumulative score of categories under the émotions positives cluster. Polarity is neutral (polarite equal to neutre) when the score is between 0 and 2.
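The derivation above can be sketched in a few lines of Python. This is only an illustration, not the model's actual implementation: the function names are hypothetical, the category objects are assumed to have the hierarchy and score keys shown in the Example section, cluster nodes are skipped so that their totals are not counted twice, and the neutral range is assumed to include both 0 and 2.

```python
POSITIVE_CLUSTER = "émotions positives"
NEGATIVE_CLUSTER = "émotions négatives"


def sentiment_score(categories):
    """Positive-cluster total minus negative-cluster total (leaf categories only)."""
    total = 0
    for category in categories:
        hierarchy = category["hierarchy"]
        if len(hierarchy) < 2:
            # Skip the cluster node itself: its score already sums its children.
            continue
        if hierarchy[0] == POSITIVE_CLUSTER:
            total += category["score"]
        elif hierarchy[0] == NEGATIVE_CLUSTER:
            total -= category["score"]
    return total


def polarity(score, neutral_low=0, neutral_high=2):
    """Map a score to a polarite value; inclusive neutral bounds are an assumption."""
    if neutral_low <= score <= neutral_high:
        return "neutre"
    return "positive" if score > 0 else "negative"


# Reduced category objects taken from the first example in this document.
categories = [
    {"hierarchy": ["émotions positives"], "score": 22},
    {"hierarchy": ["émotions positives", "joie"], "score": 4},
    {"hierarchy": ["émotions positives", "affection"], "score": 4},
    {"hierarchy": ["émotions positives", "satisfaction"], "score": 14},
]

score = sentiment_score(categories)
print(score, polarity(score))  # prints: 22 positive
```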
SENTIMENT_POS_INFO
The SENTIMENT_POS_INFO group extracts information about the elements of the text that contributed to positive sentiment.
Class | Description |
---|---|
facteur_semantique_pos | Term with positive connotation. |
facteur_declencheur_pos | Cause or object that the positive terms refer to. |
SENTIMENT_NEG_INFO
The SENTIMENT_NEG_INFO group extracts information about the elements of the text that contributed to negative sentiment.
Class | Description |
---|---|
facteur_semantique_neg | Term with negative connotation. |
facteur_declencheur_neg | Cause or object that the negative terms refer to. |
ENTITES
The ENTITES group extracts references to named entities mentioned in the text, for example:
Les viennoiseries du Café de Flore sont une tuerie.
Class | Description |
---|---|
entite | Named entity. |
FAITS_DIVERSE
The FAITS_DIVERSE group extracts facts and events with a potential positive or negative connotation (for example: crise sanitaire).
Class | Description |
---|---|
potentielement_negatif | Potentially negative fact or event. |
potentielement_positif | Potentially positive fact or event. |
ECONOMIE_FINANCE
The ECONOMIE_FINANCE group extracts financial or economic terms with a potential positive or negative connotation (for example: reprise économique or récession).
Class | Description |
---|---|
potentielement_negatif | Potentially negative financial or economic term. |
potentielement_positif | Potentially positive financial or economic term. |
Output structure
The model output has the same structure as any other model and is affected by the functional properties of the workflow block.
The parts of the output specific to this model are the result of categorization, that is the categories array, and the result of information extraction, that is the extractions array.
Example
Considering the input text:
Magnifique épopée, une belle histoire, touchante avec des acteurs qui interprètent très bien leur rôles (Mel Gibson, Heath Ledger, Jason Isaacs...), le genre de film qui se savoure en famille!
the categorization output is like the following:
"categories": [
{
"frequency": 33.33,
"hierarchy": [
"émotions positives"
],
"id": "1200",
"label": "émotions positives",
"namespace": "french_sentiment",
"positions": [],
"score": 22,
"winner": true
},
{
"frequency": 6.05,
"hierarchy": [
"émotions positives",
"joie"
],
"id": "1210",
"label": "joie",
"namespace": "french_sentiment",
"positions": [
{
"end": 180,
"start": 173
}
],
"score": 4,
"winner": true
},
{
"frequency": 6.05,
"hierarchy": [
"émotions positives",
"affection"
],
"id": "1220",
"label": "affection",
"namespace": "french_sentiment",
"positions": [
{
"end": 37,
"start": 29
},
{
"end": 48,
"start": 39
}
],
"score": 4,
"winner": true
},
{
"frequency": 21.2,
"hierarchy": [
"émotions positives",
"satisfaction"
],
"id": "1230",
"label": "satisfaction",
"namespace": "french_sentiment",
"positions": [
{
"end": 10,
"start": 0
},
{
"end": 17,
"start": 11
},
{
"end": 28,
"start": 23
},
{
"end": 37,
"start": 29
}
],
"score": 14,
"winner": true
}
]
and the extraction output is like this:
"extractions": [
{
"fields": [
{
"name": "facteur_declencheur_pos",
"positions": [
{
"end": 37,
"start": 29
}
],
"value": "histoire"
},
{
"name": "facteur_declencheur_pos",
"positions": [
{
"end": 17,
"start": 11
}
],
"value": "épopée"
},
{
"name": "facteur_semantique_pos",
"positions": [
{
"end": 28,
"start": 23
}
],
"value": "beau"
},
{
"name": "facteur_semantique_pos",
"positions": [
{
"end": 10,
"start": 0
}
],
"value": "magnifique"
},
{
"name": "facteur_semantique_pos",
"positions": [
{
"end": 48,
"start": 39
}
],
"value": "touchant"
},
{
"name": "facteur_semantique_pos",
"positions": [
{
"end": 180,
"start": 173
}
],
"value": "savourer"
}
],
"namespace": "french_sentiment",
"template": "SENTIMENT_POS_INFO"
},
{
"fields": [
{
"name": "score",
"positions": [],
"value": "22"
}
],
"namespace": "french_sentiment",
"template": "SENTIMENT_SCORE"
},
{
"fields": [
{
"name": "polarite",
"positions": [],
"value": "positive"
}
],
"namespace": "french_sentiment",
"template": "SENTIMENT_SCORE"
}
]
In this model's output, the template key corresponds to the concept of group and template fields correspond to classes.
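Because the same template can appear in more than one record (SENTIMENT_SCORE appears once with score and once with polarite in the output above), it can be convenient to merge records by group. The following is a minimal sketch, assuming the extractions array has already been parsed into Python dictionaries; group_extractions is a hypothetical helper, not part of any product API.

```python
from collections import defaultdict


def group_extractions(extractions):
    """Return {group: {class: [values]}} built from an extractions array."""
    grouped = defaultdict(lambda: defaultdict(list))
    for record in extractions:
        for field in record["fields"]:
            grouped[record["template"]][field["name"]].append(field["value"])
    # Convert nested defaultdicts to plain dicts for easier printing and comparison.
    return {group: dict(classes) for group, classes in grouped.items()}


# The two SENTIMENT_SCORE records from the first example.
extractions = [
    {
        "fields": [{"name": "score", "positions": [], "value": "22"}],
        "namespace": "french_sentiment",
        "template": "SENTIMENT_SCORE",
    },
    {
        "fields": [{"name": "polarite", "positions": [], "value": "positive"}],
        "namespace": "french_sentiment",
        "template": "SENTIMENT_SCORE",
    },
]

grouped = group_extractions(extractions)
print(grouped)
# prints: {'SENTIMENT_SCORE': {'score': ['22'], 'polarite': ['positive']}}
```

Note that field values are strings in the sample output, so the score needs an explicit conversion, for example int(grouped["SENTIMENT_SCORE"]["score"][0]), before it is used in arithmetic.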
You can see that the score (the value of the score class in the SENTIMENT_SCORE group) is the sum of the scores of the categories below the cluster (22 = 4 + 4 + 14). It is positive because all the predicted categories are under the émotions positives cluster.
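The positions in the sample output are consistent with character offsets into the input text, with start inclusive and end exclusive. Under that assumption, which is inferred from the sample rather than stated by the source, the matched text can be recovered by slicing. The helper name surface_forms is hypothetical, and the text below is only the beginning of the example input.

```python
# Beginning of the input text from the first example.
text = "Magnifique épopée, une belle histoire, touchante avec des acteurs"

# Positions copied from the SENTIMENT_POS_INFO fields shown above.
positions = [
    {"start": 0, "end": 10},   # value "magnifique"
    {"start": 11, "end": 17},  # value "épopée"
    {"start": 23, "end": 28},  # value "beau"
    {"start": 29, "end": 37},  # value "histoire"
    {"start": 39, "end": 48},  # value "touchant"
]


def surface_forms(text, positions):
    """Slice the text at each position, treating offsets as character indices."""
    return [text[p["start"]:p["end"]] for p in positions]


print(surface_forms(text, positions))
# prints: ['Magnifique', 'épopée', 'belle', 'histoire', 'touchante']
```

Comparing the slices with the field values shows that value holds a normalized form (beau, touchant) while the positions point at the form actually written in the text (belle, touchante).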
Here is another example, with a text about economy and finance:
Travail d'enfants, salariés sous-payés: une société mise en cause dans Cash Investigation se défend
Par Jean Blaquière
Publié le 19/06/2019 à 17:45, mis à jour le 20/06/2019 à 12:50
La journaliste Élise Lucet et le directeur des affaires internationales du groupe Limagrain Jean-Christophe Gouache, dans l'enquête de Cash Investigation intitulée «Multinationales: hold-up sur nos fruits et légumes», diffusée sur France 2 mardi 18 juin. Capture d'écran Cash Investigation
Dans un reportage diffusé mardi sur France 2, l'équipe d'Élise Lucet s'est attaquée aux multinationales semencières de fruits et légumes. Parmi elles, le français Limagrain se retrouve accusé de faire travailler des enfants et de sous-payer ses salariés en Inde. Le groupe a réagi le lendemain de la diffusion dans un communiqué.
Elle s'inscrit en faux contre Cash Investigation. La multinationale Limagrain, sévèrement attaquée dans un reportage de l'émission diffusé mardi soir sur France 2, dément les accusations qui lui sont portées, via un communiqué de presse publié ce mercredi. «NON, Limagrain ne fait pas travailler des enfants en Inde» et «NON, Limagrain ne sous-paie pas ses salariés en Inde», affirme le semencier. Qui a raison, qui a tort?
The categorization part of the output is:
"categories": [
{
"frequency": 33.33,
"hierarchy": [
"émotions négatives"
],
"id": "1100",
"label": "émotions négatives",
"namespace": "french_sentiment",
"positions": [],
"score": 24,
"winner": true
},
{
"frequency": 5.55,
"hierarchy": [
"émotions négatives",
"détresse"
],
"id": "1130",
"label": "détresse",
"namespace": "french_sentiment",
"positions": [
{
"end": 17,
"start": 0
},
{
"end": 89,
"start": 76
}
],
"score": 4,
"winner": true
},
{
"frequency": 27.77,
"hierarchy": [
"émotions négatives",
"contrariété"
],
"id": "1140",
"label": "contrariété",
"namespace": "french_sentiment",
"positions": [
{
"end": 65,
"start": 52
},
{
"end": 99,
"start": 93
},
{
"end": 831,
"start": 825
},
{
"end": 850,
"start": 832
}
],
"score": 20,
"winner": true
}
],
and the extraction part is:
"extractions": [
{
"fields": [
{
"name": "entite",
"positions": [
{
"end": 850,
"start": 832
},
{
"end": 471,
"start": 453
},
{
"end": 335,
"start": 317
},
{
"end": 89,
"start": 71
}
],
"value": "Cash Investigation"
},
{
"name": "entite",
"positions": [
{
"end": 964,
"start": 956
},
{
"end": 516,
"start": 508
},
{
"end": 421,
"start": 413
}
],
"value": "France 2"
},
{
"name": "entite",
"positions": [
{
"end": 273,
"start": 264
},
{
"end": 1074,
"start": 1065
},
{
"end": 1137,
"start": 1128
},
{
"end": 879,
"start": 870
},
{
"end": 644,
"start": 635
}
],
"value": "Limagrain"
},
{
"name": "entite",
"positions": [
{
"end": 397,
"start": 347
}
],
"value": "Multinationales: hold-up sur nos fruits et légumes"
}
],
"namespace": "french_sentiment",
"template": "ENTITES"
},
{
"fields": [
{
"name": "facteur_declencheur_neg",
"positions": [
{
"end": 850,
"start": 832
}
],
"value": "Cash Investigation"
},
{
"name": "facteur_declencheur_neg",
"positions": [
{
"end": 17,
"start": 0
}
],
"value": "travail d'enfants"
},
{
"name": "facteur_semantique_neg",
"positions": [
{
"end": 831,
"start": 825
}
],
"value": "contre"
},
{
"name": "facteur_semantique_neg",
"positions": [
{
"end": 99,
"start": 93
}
],
"value": "défendre"
},
{
"name": "facteur_semantique_neg",
"positions": [
{
"end": 89,
"start": 76
}
],
"value": "investigation"
},
{
"name": "facteur_semantique_neg",
"positions": [
{
"end": 65,
"start": 52
}
],
"value": "mettre en cause"
}
],
"namespace": "french_sentiment",
"template": "SENTIMENT_NEG_INFO"
},
{
"fields": [
{
"name": "score",
"positions": [],
"value": "-24"
}
],
"namespace": "french_sentiment",
"template": "SENTIMENT_SCORE"
},
{
"fields": [
{
"name": "polarite",
"positions": [],
"value": "negative"
}
],
"namespace": "french_sentiment",
"template": "SENTIMENT_SCORE"
}
],
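Each entite field lists one position per mention, so the number of positions gives a rough mention count for every entity. A minimal sketch, assuming the extractions array is available as parsed Python dictionaries; mention_counts is a hypothetical helper, and the positions below are copied from the output above for two of the entities.

```python
def mention_counts(extractions, template="ENTITES", field_name="entite"):
    """Count mentions per entity value as the number of positions reported."""
    counts = {}
    for record in extractions:
        if record["template"] != template:
            continue
        for field in record["fields"]:
            if field["name"] == field_name:
                value = field["value"]
                counts[value] = counts.get(value, 0) + len(field["positions"])
    return counts


extractions = [
    {
        "fields": [
            {
                "name": "entite",
                "positions": [
                    {"end": 964, "start": 956},
                    {"end": 516, "start": 508},
                    {"end": 421, "start": 413},
                ],
                "value": "France 2",
            },
            {
                "name": "entite",
                "positions": [{"end": 397, "start": 347}],
                "value": "Multinationales: hold-up sur nos fruits et légumes",
            },
        ],
        "namespace": "french_sentiment",
        "template": "ENTITES",
    }
]

counts = mention_counts(extractions)
print(counts["France 2"])  # prints: 3
```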