Chemical-Induced Disease Detection Using Invariance-based Pattern Learning Model

WS 2017  ·  Neha Warikoo, Yung-Chun Chang, Wen-Lian Hsu ·

In this work, we introduce a novel feature engineering approach named {``}algebraic invariance{''} to identify discriminative patterns for learning relation pair features for the chemical-disease relation (CDR) task of BioCreative V. Our method exploits the existing structural similarity of the key concepts of relation descriptions from the CDR corpus to generate robust linguistic patterns for SVM tree kernel-based learning. Preprocessing of the training data classifies the entity pairs as either related or unrelated to build instance types for both inter-sentential and intra-sentential scenarios. An invariant function is proposed to process and optimally cluster similar patterns for both positive and negative instances. The learning model for CDR pairs is based on the SVM tree kernel approach, which generates feature trees and vectors and is modeled on suitable invariance based patterns, bringing brevity, precision and context to the identifier features. Results demonstrate that our method outperformed other compared approaches, achieved a high recall rate of 85.08{\%}, and averaged an F1-score of 54.34{\%} without the use of any additional knowledge bases.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods