Banking Email Categorization Knowledge Model

Overview

The Banking Email Categorization Knowledge Model (display name: Banking Email Categorization EN v#) classifies customer support email messages to help banks sort and route this kind of communication.

The model predicts categories covering the most common customer support topics in the banking domain (see below).

Note

The model accepts plain English text as its input. To extract the text from Outlook MSG files or other message export file formats, you can use the TikaTesseract Converter processor or external software.

It is useful to include message headers with their labels at the beginning of the message text.
For example:

    From: Adams Amy [mailto:]
    Sent: sunday 6 aug 2019 14:37
    To: Smith, J. (John) <[email protected]>
    Subject: credit card PIN reissue
    Importance: High
    Hello,
    I have ordered a replacement card, however I haven't received the PIN yet.
    How do I request the reissue of the PIN?
    Thank you.
    Best regards
    Amy
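
The header-plus-body layout shown above can be assembled programmatically before the text is passed to the model. The following is a minimal sketch: the function name `build_input_text` and the dictionary used to hold the headers are illustrative assumptions, not part of the model or of any processor.

```python
def build_input_text(headers, body):
    """Prefix the message body with labeled headers, one per line.

    `headers` maps a label (e.g. "Subject") to its value, in the order
    the headers should appear. Headers with empty values are skipped.
    Hypothetical helper, shown only to illustrate the layout.
    """
    header_lines = [f"{label}: {value}" for label, value in headers.items() if value]
    return "\n".join(header_lines + [body.strip()])


text = build_input_text(
    {"Subject": "credit card PIN reissue", "Importance": "High"},
    "Hello,\nHow do I request the reissue of the PIN?\nThank you.",
)
print(text)
```

Keeping the labels ("Subject:", "Importance:") in the text means the subject line is available to the model in the same form as in the example above.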

Category tree

Possible categories are:

00. (General)
    00.01. Branch opening hours
    00.02. Contact an agent
01. Digital banking
    01.01. User - add, create
        01.01.02. User - modify
        01.01.03. User - remove
        01.01.04. Log-in - generic
        01.01.05. Log-in - first
        01.01.06. Pin - generic
        01.01.07. Pin - unlock
        01.01.08. Pin - reissue
        01.01.09. Password
        01.01.10. Certificate - (re)issue
        01.01.11. Certificate - install
        01.01.12. Other authentication
        01.01.13. Client ID
        01.01.14. Browser, OS, app issues
02. Bank account
    02.01. Open
    02.02. Modify
    02.03. Close
    02.04. Authorized signatory - add, modify, remove
    02.05. Balance
    02.06. Movements
    02.07. Audit letter request
    02.08. Bank fees
03. Payments
    03.01. Bank transfer
        03.01.01. Domestic, SEPA - status
        03.01.02. Domestic, SEPA - cancellation
        03.01.03. Domestic, SEPA - inquiry
        03.01.04. International - status
        03.01.05. International - cancellation
        03.01.06. International - inquiry
        03.01.07. Proof of payment request
    03.02. Online payments
    03.03. Digital wallet
    03.04. Checks
    03.05. Direct debit
04. Loans
    04.01. Loan - eligibility
    04.02. Loan - amount
    04.03. Mortgage - eligibility
    04.04. Mortgage - amount
    04.05. Installments
    04.06. Interest rates
05. Investing and trading
    05.01. Online investing
        05.01.01. Trading issues
    05.02. Investment services
06. Cards
    06.01. Activation
    06.02. Pin - generic
    06.03. Pin - unlock
    06.04. Pin - reissue
    06.05. Contactless payments
    06.06. Credit cards
    06.07. Debit cards
    06.08. Lost or stolen card
    06.09. Expiration
    06.10. Other issues
99. Uncategorized

The categories are divided into seven groups plus a category for unrecognized topics.
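
Because every category ID starts with the two-digit code of its group, a routing step can derive the group from the ID alone. The sketch below assumes IDs are formatted as in the tree above (dot-separated, with a trailing dot, e.g. `06.04.`); the function name `top_level_group` and the choice to treat unknown prefixes as Uncategorized are assumptions made for illustration.

```python
# Top-level groups copied from the category tree above.
GROUPS = {
    "00": "(General)",
    "01": "Digital banking",
    "02": "Bank account",
    "03": "Payments",
    "04": "Loans",
    "05": "Investing and trading",
    "06": "Cards",
    "99": "Uncategorized",
}


def top_level_group(category_id):
    """Return the group name for an ID such as "06.04." or "03.01.02."."""
    prefix = category_id.split(".")[0]
    # Assumption: anything outside the known groups is handled as Uncategorized.
    return GROUPS.get(prefix, GROUPS["99"])


print(top_level_group("06.04."))     # Cards
print(top_level_group("03.01.02."))  # Payments
```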

Output structure

The model output has the same structure as the output of any other model and is affected by the functional properties of the workflow block.
The part specific to this model is the result of categorization, i.e. the categories array.

Example

Considering the following input text:

    From: Adams Amy [mailto:]
    Sent: sunday 6 aug 2019 14:37
    To: Smith, J. (John) <[email protected]>
    Subject: credit card PIN reissue
    Importance: High
    Hello,
    I have ordered a replacement card, however I haven't received the PIN yet.
    How do I request the reissue of the PIN?
    Thank you.
    Best regards
    Amy

the results of categorization are:

"categories": [
    {
        "frequency": 71.15,
        "hierarchy": [
            "Cards",
            "Pin - reissue"
        ],
        "id": "06.04.",
        "label": "Pin - reissue",
        "namespace": "email_categorization_en",
        "positions": [
            {
                "end": 105,
                "start": 98
            },
            {
                "end": 118,
                "start": 107
            },
            {
                "end": 122,
                "start": 119
            },
            {
                "end": 130,
                "start": 123
            },
            {
                "end": 188,
                "start": 184
            },
            {
                "end": 207,
                "start": 200
            },
            {
                "end": 216,
                "start": 208
            },
            {
                "end": 224,
                "start": 221
            },
            {
                "end": 258,
                "start": 251
            },
            {
                "end": 269,
                "start": 266
            }
        ],
        "score": 37,
        "winner": true
    }
]
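
A consumer of this output typically needs only the winning categories and their place in the tree. The following sketch reads the categories array shown above and returns the winners ordered by score. It assumes the array is wrapped in an enclosing JSON object; the function name `winning_categories` is hypothetical, and the positions list in the sample is shortened to one entry for brevity.

```python
import json


def winning_categories(output):
    """Return (id, hierarchy path, score) for each winning category.

    Results are sorted by score, highest first. `output` is the parsed
    JSON object that contains the "categories" array.
    """
    winners = [c for c in output.get("categories", []) if c.get("winner")]
    winners.sort(key=lambda c: c["score"], reverse=True)
    return [(c["id"], " > ".join(c["hierarchy"]), c["score"]) for c in winners]


# Sample built from the example output above (positions abbreviated).
sample = json.loads("""
{
    "categories": [
        {
            "frequency": 71.15,
            "hierarchy": ["Cards", "Pin - reissue"],
            "id": "06.04.",
            "label": "Pin - reissue",
            "namespace": "email_categorization_en",
            "positions": [{"end": 105, "start": 98}],
            "score": 37,
            "winner": true
        }
    ]
}
""")

result = winning_categories(sample)
print(result)
```

Joining the hierarchy array gives the full path ("Cards > Pin - reissue"), which distinguishes labels that occur in more than one group, such as "Pin - reissue" under both Digital banking and Cards.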