Banking Email Categorization Knowledge Model
Overview
The Banking Email Categorization Knowledge Model (display name: Banking Email Categorization EN v#) classifies customer support email messages to help banks sort and route them.
The model predicts categories covering the most common customer support topics in the banking domain (see below).
Note
The model accepts plain English text as its input. To extract the text from Outlook MSG files or other message export file formats, you can use the TikaTesseract Converter processor or external software.
It is useful to include message headers with their labels at the beginning of the message text.
For example:
From: Adams Amy [mailto:]
Sent: sunday 6 aug 2019 14:37
To: Smith, J. (John) <[email protected]>
Subject: credit card PIN reissue
Importance: High
Hello,
I have ordered a replacement card, however I haven't received the PIN yet.
How do I request the reissue of the PIN?
Thank you.
Best regards
Amy
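As an illustration of the recommendation above, the following minimal Python sketch assembles the model input from header values and a message body that have already been extracted. The function name build_model_input and the HEADER_LABELS tuple are hypothetical helpers introduced here, not part of the product; the labels simply mirror the ones in the example message.

```python
# Hypothetical helper: the names below are illustrative, not a product API.
HEADER_LABELS = ("From", "Sent", "To", "Subject", "Importance")


def build_model_input(headers: dict, body: str) -> str:
    """Return plain text made of labeled header lines followed by the body."""
    # Keep only the headers that are present and non-empty, in a fixed order.
    lines = [
        f"{label}: {headers[label]}"
        for label in HEADER_LABELS
        if headers.get(label)
    ]
    # The message body follows the header lines.
    lines.append(body.strip())
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        build_model_input(
            {"Subject": "credit card PIN reissue", "Importance": "High"},
            "Hello,\nHow do I request the reissue of the PIN?\nThank you.",
        )
    )
```

Headers that are missing from the dictionary are skipped, so the same helper can be used when only some of the header fields are available.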
Category tree
Possible categories are:
00. (General)
00.01. Branch opening hours
00.02. Contact an agent
01. Digital banking
01.01. User - add, create
01.01.02. User - modify
01.01.03. User - remove
01.01.04. Log-in - generic
01.01.05. Log-in - first
01.01.06. Pin - generic
01.01.07. Pin - unlock
01.01.08. Pin - reissue
01.01.09. Password
01.01.10. Certificate - (re)issue
01.01.11. Certificate - install
01.01.12. Other authentication
01.01.13. Client ID
01.01.14. Browser, OS, app issues
02. Bank account
02.01. Open
02.02. Modify
02.03. Close
02.04. Authorized signatory - add, modify, remove
02.05. Balance
02.06. Movements
02.07. Audit letter request
02.08. Bank fees
03. Payments
03.01. Bank transfer
03.01.01. Domestic, SEPA - status
03.01.02. Domestic, SEPA - cancellation
03.01.03. Domestic, SEPA - inquiry
03.01.04. International - status
03.01.05. International - cancellation
03.01.06. International - inquiry
03.01.07. Proof of payment request
03.02. Online payments
03.03. Digital wallet
03.04. Checks
03.05. Direct debit
04. Loans
04.01. Loan - eligibility
04.02. Loan - amount
04.03. Mortgage - eligibility
04.04. Mortgage - amount
04.05. Installments
04.06. Interest rates
05. Investing and trading
05.01. Online investing
05.01.01. Trading issues
05.02. Investment services
06. Cards
06.01. Activation
06.02. Pin - generic
06.03. Pin - unlock
06.04. Pin - reissue
06.05. Contactless payments
06.06. Credit cards
06.07. Debit cards
06.08. Lost or stolen card
06.09. Expiration
06.10. Other issues
99. Uncategorized
The categories are divided into seven groups plus a category for unrecognized topics.
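For routing purposes it can be handy to map a category ID to its top-level group. The sketch below assumes that the dotted ID encodes the position in the tree (for example, that "06.04." belongs to group "06."); the helper names category_levels and group_name are hypothetical, while the group labels are copied from the list above.

```python
# Top-level groups, copied from the category tree.
GROUPS = {
    "00.": "(General)",
    "01.": "Digital banking",
    "02.": "Bank account",
    "03.": "Payments",
    "04.": "Loans",
    "05.": "Investing and trading",
    "06.": "Cards",
    "99.": "Uncategorized",
}


def category_levels(category_id: str) -> list:
    """Split an ID such as '03.01.04.' into ['03.', '03.01.', '03.01.04.']."""
    # Assumption: every dot-separated segment is one level of the tree.
    parts = [part for part in category_id.split(".") if part]
    return [".".join(parts[:i]) + "." for i in range(1, len(parts) + 1)]


def group_name(category_id: str) -> str:
    """Return the label of the top-level group, or 'Unknown' if not listed."""
    levels = category_levels(category_id)
    if not levels:
        return "Unknown"
    return GROUPS.get(levels[0], "Unknown")


if __name__ == "__main__":
    print(category_levels("03.01.04."))
    print(group_name("06.04."))
```

A mailbox router could, for instance, use group_name to choose a destination queue and fall back to manual triage for "Uncategorized" or "Unknown".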
Output structure
The model output has the same structure as the output of any other model and is affected by the functional properties of the workflow block.
The part that is specific to this model is the result of categorization, i.e. the categories array.
Example
Given the following input text:
From: Adams Amy [mailto:]
Sent: sunday 6 aug 2019 14:37
To: Smith, J. (John) <[email protected]>
Subject: credit card PIN reissue
Importance: High
Hello,
I have ordered a replacement card, however I haven't received the PIN yet.
How do I request the reissue of the PIN?
Thank you.
Best regards
Amy
the results of categorization are:
"categories": [
    {
        "frequency": 71.15,
        "hierarchy": [
            "Cards",
            "Pin - reissue"
        ],
        "id": "06.04.",
        "label": "Pin - reissue",
        "namespace": "email_categorization_en",
        "positions": [
            {
                "end": 105,
                "start": 98
            },
            {
                "end": 118,
                "start": 107
            },
            {
                "end": 122,
                "start": 119
            },
            {
                "end": 130,
                "start": 123
            },
            {
                "end": 188,
                "start": 184
            },
            {
                "end": 207,
                "start": 200
            },
            {
                "end": 216,
                "start": 208
            },
            {
                "end": 224,
                "start": 221
            },
            {
                "end": 258,
                "start": 251
            },
            {
                "end": 269,
                "start": 266
            }
        ],
        "score": 37,
        "winner": true
    }
]
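A consumer of this output typically needs the winning categories and, optionally, the text spans that triggered them. The following Python sketch shows one possible way to read the categories array once the full model output has been parsed into a dictionary. It assumes that "start" and "end" are character offsets into the input text with "end" exclusive, which the example suggests but does not state; the function names are hypothetical.

```python
from typing import Any


def winning_categories(output: dict) -> list:
    """Return the categories flagged as winners, highest score first."""
    winners = [c for c in output.get("categories", []) if c.get("winner")]
    return sorted(winners, key=lambda c: c.get("score", 0), reverse=True)


def matched_snippets(text: str, category: dict) -> list:
    """Return the pieces of the input text referenced by 'positions'."""
    # Assumption: offsets are character based and 'end' is exclusive.
    return [
        text[pos["start"]:pos["end"]]
        for pos in category.get("positions", [])
    ]


if __name__ == "__main__":
    # Small made-up input, shaped like the categories array above.
    sample_text = "Please reissue the PIN"
    sample_output: dict[str, Any] = {
        "categories": [
            {
                "id": "06.04.",
                "label": "Pin - reissue",
                "hierarchy": ["Cards", "Pin - reissue"],
                "positions": [
                    {"start": 7, "end": 14},
                    {"start": 19, "end": 22},
                ],
                "score": 37,
                "winner": True,
            }
        ]
    }
    for category in winning_categories(sample_output):
        path = " > ".join(category["hierarchy"])
        print(category["id"], path, matched_snippets(sample_text, category))
```

The sample data in the sketch is invented for illustration; with real output, the offsets should be checked against the exact text that was sent to the model, since any preprocessing that changes the text would shift them.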