Biomedical NER using Novel Schema and Distant Supervision
Biomedical Named Entity Recognition (BMNER) is one of the most important tasks in biomedical text mining. Most work on this task so far has not focused on identifying discontinuous and overlapping entities, even though they make up a significant fraction of the entities in real-life biomedical datasets. In this paper, we introduce a novel annotation schema to capture complex entities, and explore the effects of distant supervision on our deep-learning sequence labelling model. On the BMNER task, our annotation schema outperforms other BIO-based annotation schemes on the same model. We also achieve higher F1-scores than state-of-the-art models on multiple corpora without fine-tuning embeddings, highlighting the efficacy of neural feature extraction using our model.
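To illustrate why discontinuous and overlapping entities are a problem for standard tagging, the sketch below encodes spans with plain BIO tags and decodes them again. It does not implement the paper's schema, which is not specified here. The function names `spans_to_bio` and `bio_to_spans`, the `Disorder` label, and the example sentence are all assumptions made for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # token indices [start, end), end exclusive


def spans_to_bio(n_tokens: int, spans: List[Span], label: str = "Disorder") -> List[str]:
    """Encode contiguous, non-overlapping spans as plain BIO tags."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        # Each token carries exactly one tag, so a second span over the
        # same token cannot be represented.
        if any(t != "O" for t in tags[start:end]):
            raise ValueError("plain BIO cannot encode overlapping spans")
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode BIO tags back into contiguous spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if start is not None and not tag.startswith("I-"):
            spans.append((start, i))
            start = None
        if tag.startswith("B-"):
            start = i
    return spans


# Hypothetical sentence: the single mention "left atrium ... dilated"
# covers tokens 0, 1 and 4, with a gap at tokens 2 and 3.
tokens = ["left", "atrium", "is", "mildly", "dilated"]
fragments = [(0, 2), (4, 5)]

tags = spans_to_bio(len(tokens), fragments)
decoded = bio_to_spans(tags)
print(tags)
print(decoded)
```

Decoding returns two separate spans, so the information that both fragments belong to one mention is lost, and an overlapping pair such as `[(0, 2), (1, 3)]` is rejected outright. Extended schemas add tags to recover this information.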
Results from the Paper
Ranked #1 on Medical Named Entity Recognition on ShARe/CLEF 2014 Task 2 Disorders (using extra training data)
Task | Dataset | Model | Metric Name | Metric Value | Global Rank | Uses Extra Training Data | Benchmark |
---|---|---|---|---|---|---|---|
Medical Named Entity Recognition | ShARe/CLEF 2014 Task 2 Disorders | Distant Supervision with BIODT Tagging | F1 | 0.807 | #1 | | |
Medical Named Entity Recognition | ShARe/CLEF eHealth corpus | Distant Supervision with BIODT Tagging | F1 | 0.799 | #2 | | |