Life Sciences - Medical Knowledge Model

Overview

The Life Sciences - Medical Knowledge Model (display name: Life Sciences - Medical EN v#) is an extraction model that extracts various types of biomedical entities from English texts.
Extracted entities correspond to entries in the Unified Medical Language System (UMLS) vocabularies.

More specifically, the model can extract entities belonging to three of the most common classes in the UMLS taxonomy:

  • Drugs
  • Diseases
  • Signs or symptoms

The model has been designed for a specific text type, that is, scientific papers and academic articles like those found on PubMed.

Covered vocabularies

UMLS is a resource that gathers more than 200 vocabularies in the health and biomedical sciences, and that integrates and distributes key terminology and coding standards in order to promote and enable interoperability between computer systems and services.

The biggest component of UMLS is the Metathesaurus, through which all concepts are interconnected and linked to similar concepts and terminologies inside the vocabularies.

The concepts gathered in UMLS are organized by semantic type, and each semantic type is in turn linked to a semantic group.

The Life Sciences - Medical Knowledge Model covers these UMLS vocabularies:

  • Medical Subject Headings (MeSH)
  • International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM)
  • International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM)
  • Metathesaurus Additional Entry Terms for ICD-9-CM (MTHICD9)
  • SNOMED CT United States Edition
  • NCI Dictionary of Cancer Terms
  • CHV (Consumer Health Vocabulary)
  • LNC (Logical Observation Identifiers Names and Codes terminology LOINC®)
  • OMIM - Online Mendelian Inheritance in Man
  • ICPC2P (International Classification of Primary Care - 2 PLUS)

Extraction groups and classes

UMLS semantic groups:

  • Drugs
  • Diseases
  • Signs or symptoms

are mapped to extraction groups DRUGS, DISEASES and SIGNORSYMPTOMS.

For example, in the UMLS these are considered drugs:

  • Antibiotics
  • Vitamins
  • Pharmacological substances and many others

So they will be predicted as classes of the DRUGS group.

In addition to the types belonging to the drug semantic group, the model extracts mentions of substances that are classified by mechanism of action and that belong to the Mecanismofaction semantic class and Action semantic group. This group comprises substances falling into the agonist, antagonist, inhibitor, blocker and activator categories.
Thus, the model is able to extract a drug's trade name (for example Trulicity), its molecule name (Dulaglutide) and the name of its mechanism of action (glucagon-like peptide-1/GLP-1 receptor agonist).

Each of the model's extraction groups has one class:

Group           Class
DRUGS           DRUG
DISEASES        DISEASE
SIGNORSYMPTOMS  SIGNORSYMPTOM

For example, given this text:

A 42-year-old Hispanic woman, with end-stage renal disease, anemia, hypertension, and a history of an anaphylactic reaction to basiliximab, was scheduled to receive a living donor transplant and received basiliximab uneventfully. 
Dulaglutide was generally well tolerated, with a low inherent risk of hypoglycemia. The most frequently reported adverse events in clinical trials were gastrointestinal-related (for example nausea, vomiting, and diarrhea).

extractions are:

Group           Class          Class value
DISEASES        DISEASE        Chronic kidney disease stage 5
DISEASES        DISEASE        anaemia
DISEASES        DISEASE        hypertension
DISEASES        DISEASE        Anaphylaxis
DISEASES        DISEASE        hypoglycemia
DRUGS           DRUG           basiliximab
DRUGS           DRUG           basiliximab
DRUGS           DRUG           dulaglutide
SIGNORSYMPTOMS  SIGNORSYMPTOM  nausea
SIGNORSYMPTOMS  SIGNORSYMPTOM  vomit
SIGNORSYMPTOMS  SIGNORSYMPTOM  diarrhoea

Output structure

The model output has the same structure as the output of any other model and is affected by the functional properties of the workflow block.
The part of the output that is specific to this model is the result of information extraction, that is, the extractions array.

Example

In this model's output, the template key holds the group, and each item of the fields array corresponds to a class.
For the example text above, the extraction output is:

"extractions": [
    {
        "fields": [
            {
                "name": "DISEASE",
                "positions": [
                    {
                        "end": 58,
                        "start": 35
                    }
                ],
                "value": "Chronic kidney disease stage 5"
            }
        ],
        "namespace": "lifescience_med_en",
        "template": "DISEASES"
    },
    {
        "fields": [
            {
                "name": "DISEASE",
                "positions": [
                    {
                        "end": 123,
                        "start": 102
                    }
                ],
                "value": "anaphylaxis"
            }
        ],
        "namespace": "lifescience_med_en",
        "template": "DISEASES"
    },
    {
        "fields": [
            {
                "name": "DISEASE",
                "positions": [
                    {
                        "end": 80,
                        "start": 68
                    }
                ],
                "value": "hypertension"
            }
        ],
        "namespace": "lifescience_med_en",
        "template": "DISEASES"
    },
    {
        "fields": [
            {
                "name": "DISEASE",
                "positions": [
                    {
                        "end": 66,
                        "start": 60
                    }
                ],
                "value": "anaemia"
            }
        ],
        "namespace": "lifescience_med_en",
        "template": "DISEASES"
    },
    {
        "fields": [
            {
                "name": "DISEASE",
                "positions": [
                    {
                        "end": 312,
                        "start": 300
                    }
                ],
                "value": "hypoglycemia"
            }
        ],
        "namespace": "lifescience_med_en",
        "template": "DISEASES"
    },
    {
        "fields": [
            {
                "name": "DRUG",
                "positions": [
                    {
                        "end": 138,
                        "start": 127
                    },
                    {
                        "end": 215,
                        "start": 204
                    }
                ],
                "value": "basiliximab"
            }
        ],
        "namespace": "lifescience_med_en",
        "template": "DRUGS"
    },
    {
        "fields": [
            {
                "name": "DRUG",
                "positions": [
                    {
                        "end": 241,
                        "start": 230
                    }
                ],
                "value": "dulaglutide"
            }
        ],
        "namespace": "lifescience_med_en",
        "template": "DRUGS"
    },
    {
        "fields": [
            {
                "name": "SIGNORSYMPTOM",
                "positions": [
                    {
                        "end": 426,
                        "start": 420
                    }
                ],
                "value": "nausea"
            }
        ],
        "namespace": "lifescience_med_en",
        "template": "SIGNSORSYMPTOMS"
    },
    {
        "fields": [
            {
                "name": "SIGNORSYMPTOM",
                "positions": [
                    {
                        "end": 436,
                        "start": 428
                    }
                ],
                "value": "vomit"
            }
        ],
        "namespace": "lifescience_med_en",
        "template": "SIGNSORSYMPTOMS"
    },
    {
        "fields": [
            {
                "name": "SIGNORSYMPTOM",
                "positions": [
                    {
                        "end": 450,
                        "start": 442
                    }
                ],
                "value": "diarrhoea"
            }
        ],
        "namespace": "lifescience_med_en",
        "template": "SIGNSORSYMPTOMS"
    }
]
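
The extractions array can be read with a few lines of code. The following Python sketch is illustrative only and is not part of the product: it flattens the array into rows of group, class, value and the text span each position points to. The names TEXT, OUTPUT and mentions are hypothetical. It assumes that start and end are zero-based character offsets with an exclusive end, which matches the example sentence but is an inference from the sample, not a documented guarantee. Only the first sentence of the example text and three of the extractions are reproduced.

```python
import json
from collections import defaultdict

# First sentence of the example text, copied from the example above.
TEXT = (
    "A 42-year-old Hispanic woman, with end-stage renal disease, anemia, "
    "hypertension, and a history of an anaphylactic reaction to basiliximab, "
    "was scheduled to receive a living donor transplant and received "
    "basiliximab uneventfully."
)

# Subset of the extraction output shown above, wrapped in an object.
OUTPUT = json.loads("""
{
  "extractions": [
    {"fields": [{"name": "DISEASE",
                 "positions": [{"start": 35, "end": 58}],
                 "value": "Chronic kidney disease stage 5"}],
     "namespace": "lifescience_med_en", "template": "DISEASES"},
    {"fields": [{"name": "DISEASE",
                 "positions": [{"start": 102, "end": 123}],
                 "value": "anaphylaxis"}],
     "namespace": "lifescience_med_en", "template": "DISEASES"},
    {"fields": [{"name": "DRUG",
                 "positions": [{"start": 127, "end": 138},
                               {"start": 204, "end": 215}],
                 "value": "basiliximab"}],
     "namespace": "lifescience_med_en", "template": "DRUGS"}
  ]
}
""")


def mentions(output, text):
    """Yield (group, class, value, surface text) for every position.

    Assumption: offsets are zero-based and the end offset is exclusive.
    """
    for extraction in output.get("extractions", []):
        group = extraction["template"]        # template key = group
        for field in extraction.get("fields", []):
            for pos in field.get("positions", []):
                surface = text[pos["start"]:pos["end"]]
                yield group, field["name"], field["value"], surface


rows = list(mentions(OUTPUT, TEXT))

# Collect the distinct extracted values per group.
by_group = defaultdict(set)
for group, _cls, value, _surface in rows:
    by_group[group].add(value)

for row in rows:
    print(row)
```

Under these assumptions, the first row pairs the value "Chronic kidney disease stage 5" with the surface text "end-stage renal disease", which shows that value holds the normalized entry while positions point to the wording actually found in the text. A field with several positions, such as basiliximab, yields one row per mention.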