Banking Email Categorization Knowledge Model

Overview

The Banking Email Categorization Knowledge Model (display name: Banking Email Categorization EN v#) classifies customer support email messages to help banks sort and route them.

The model predicts categories covering the most common customer support topics in the banking domain (see below).

Note

The model accepts plain English text as its input. To get the text out of Outlook MSG files or other message export file formats, you can use the Tika Converter processor or external software.

It is useful to include message headers with their labels at the beginning of the message text.
For example:

    From: Adams Amy [mailto:]
    Sent: sunday 6 aug 2019 14:37
    To: Smith, J. (John) <[email protected]>
    Subject: credit card PIN reissue
    Importance: High
    Hello,
    I have ordered a replacement card, however I haven't received the PIN yet.
    How do I request the reissue of the PIN?
    Thank you.
    Best regards
    Amy
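
The header-plus-body layout shown above can be assembled programmatically. The following is a minimal sketch, not part of the product: it assumes the message is available as a standard RFC 5322 (.eml style) string rather than an MSG file, uses only Python's standard `email` package, and the name `build_model_input` and the list of header labels are illustrative choices.

```python
from email import message_from_string
from email.policy import default

# Header labels to keep, in output order (an assumption mirroring the sample).
HEADER_LABELS = ("From", "Sent", "To", "Subject", "Importance")


def build_model_input(raw_message: str) -> str:
    """Return labelled headers followed by the plain-text body."""
    msg = message_from_string(raw_message, policy=default)
    lines = []
    for label in HEADER_LABELS:
        value = msg[label]  # None when the header is absent
        if value is not None:
            lines.append(f"{label}: {value}")
    # Pick the text/plain part; HTML-only messages yield no body here.
    body = msg.get_body(preferencelist=("plain",))
    if body is not None:
        lines.append(body.get_content().strip())
    return "\n".join(lines)
```

Headers that are missing from the message are simply skipped, so the same function works for exports that carry only some of the labels.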

Category tree

Possible categories are:

00. (General)   
    00.01.  Branch opening hours    
    00.02.  Contact an agent    
01. Digital banking 
    01.01.  User - add, create  
        01.01.02.   User - modify   
        01.01.03.   User - remove   
        01.01.04.   Log-in - generic    
        01.01.05.   Log-in - first  
        01.01.06.   Pin - generic   
        01.01.07.   Pin - unlock    
        01.01.08.   Pin - reissue   
        01.01.09.   Password    
        01.01.10.   Certificate - (re)issue 
        01.01.11.   Certificate - install   
        01.01.12.   Other authentication    
        01.01.13.   Client ID   
        01.01.14.   Browser, OS, app issues 
02. Bank account    
    02.01.  Open    
    02.02.  Modify  
    02.03.  Close
    02.04.  Authorized signatory - add, modify, remove  
    02.05.  Balance 
    02.06.  Movements   
    02.07.  Audit letter request    
    02.08.  Bank fees
03. Payments    
    03.01.  Bank transfer   
        03.01.01.   Domestic, SEPA - status 
        03.01.02.   Domestic, SEPA - cancellation   
        03.01.03.   Domestic, SEPA - inquiry    
        03.01.04.   International - status  
        03.01.05.   International - cancellation    
        03.01.06.   International - inquiry 
        03.01.07.   Proof of payment request    
    03.02.  Online payments 
    03.03.  Digital wallet  
    03.04.  Checks  
    03.05.  Direct debit
04. Loans   
    04.01.  Loan - eligibility  
    04.02.  Loan - amount   
    04.03.  Mortgage - eligibility  
    04.04.  Mortgage - amount   
    04.05.  Installments    
    04.06.  Interest rates  
05. Investing and trading
    05.01.  Online investing    
        05.01.01.   Trading issues  
    05.02.  Investment services 
06. Cards   
    06.01.  Activation  
    06.02.  Pin - generic   
    06.03.  Pin - unlock    
    06.04.  Pin - reissue   
    06.05.  Contactless payments    
    06.06.  Credit cards    
    06.07.  Debit cards 
    06.08.  Lost or stolen card 
    06.09.  Expiration  
    06.10.  Other issues    
99. Uncategorized

The categories are divided into seven groups plus a category for unrecognized topics.
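
Because every category ID starts with the two-digit code of its group, a routing step can recover the group from the ID alone. The sketch below is illustrative only: the group names are copied from the tree above, while the function name `top_level_group` and the choice to fall back to "Uncategorized" for unknown prefixes are assumptions.

```python
# Top-level groups copied from the category tree above.
GROUPS = {
    "00": "(General)",
    "01": "Digital banking",
    "02": "Bank account",
    "03": "Payments",
    "04": "Loans",
    "05": "Investing and trading",
    "06": "Cards",
    "99": "Uncategorized",
}


def top_level_group(category_id: str) -> str:
    """Map an ID such as "06.04." to the name of its top-level group."""
    prefix = category_id.split(".", 1)[0]
    # Unknown prefixes fall back to "Uncategorized" (a design assumption).
    return GROUPS.get(prefix, GROUPS["99"])
```

For example, `top_level_group("06.04.")` returns `"Cards"`, which a workflow could use as the name of a routing queue.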

Output structure

The model output has the same structure as the output of any other model and is affected by the functional options of the workflow block.
The part specific to this model is the result of categorization, that is, the categories array.

Example

Considering the following input text:

    From: Adams Amy [mailto:]
    Sent: sunday 6 aug 2019 14:37
    To: Smith, J. (John) <[email protected]>
    Subject: credit card PIN reissue
    Importance: High
    Hello,
    I have ordered a replacement card, however I haven't received the PIN yet.
    How do I request the reissue of the PIN?
    Thank you.
    Best regards
    Amy

the results of categorization are:

"categories": [
    {
        "frequency": 71.15,
        "hierarchy": [
            "Cards",
            "Pin - reissue"
        ],
        "id": "06.04.",
        "label": "Pin - reissue",
        "namespace": "email_categorization_en",
        "positions": [
            {
                "end": 105,
                "start": 98
            },
            {
                "end": 118,
                "start": 107
            },
            {
                "end": 122,
                "start": 119
            },
            {
                "end": 130,
                "start": 123
            },
            {
                "end": 188,
                "start": 184
            },
            {
                "end": 207,
                "start": 200
            },
            {
                "end": 216,
                "start": 208
            },
            {
                "end": 224,
                "start": 221
            },
            {
                "end": 258,
                "start": 251
            },
            {
                "end": 269,
                "start": 266
            }
        ],
        "score": 37,
        "winner": true
    }
]
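
A consumer of this output typically needs the winning categories and the text spans that triggered them. The sketch below shows one way to read the array in Python. It assumes the fragment above is wrapped in an enclosing JSON object, and it treats `end` as an exclusive character offset. That reading is an inference from the sample (the span 119 to 122 is three characters long, matching "PIN", and 107 to 118 is eleven, matching "credit card"), not something stated on this page. The function names and the miniature sample are hypothetical.

```python
import json


def winning_categories(output: dict) -> list[dict]:
    """Return the categories flagged as winners, highest score first."""
    winners = [c for c in output.get("categories", []) if c.get("winner")]
    return sorted(winners, key=lambda c: c["score"], reverse=True)


def matched_snippets(text: str, category: dict) -> list[str]:
    """Slice the input text at each position, treating "end" as exclusive."""
    return [text[p["start"]:p["end"]] for p in category.get("positions", [])]


# Hypothetical miniature input; the offsets below are valid for this text only.
sample_text = "Subject: credit card PIN reissue"
sample_output = json.loads("""
{"categories": [{"id": "06.04.", "label": "Pin - reissue",
  "hierarchy": ["Cards", "Pin - reissue"], "score": 37, "winner": true,
  "positions": [{"start": 9, "end": 20}, {"start": 21, "end": 24},
                {"start": 25, "end": 32}]}]}
""")

for category in winning_categories(sample_output):
    print(category["id"], " > ".join(category["hierarchy"]))
    print(matched_snippets(sample_text, category))
```

Sorting by `score` is a convenience for the case where more than one category is flagged as a winner; with a single winner, as in the example above, the order does not matter.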