Extracting Epistatic Interactions in Type 2 Diabetes Genome-Wide Data Using Stacked Autoencoder

28 Aug 2018  ·  Basma Abdulaimma, Paul Fergus, Carl Chalmers

Type 2 diabetes is a leading worldwide public health concern, and its increasing prevalence has significant health and economic importance in all nations. The condition is a multifactorial disorder with a complex aetiology. The genetic determinants remain largely elusive, with only a handful of identified candidate genes. Genome-wide association studies (GWAS) promised to significantly enhance our understanding of the genetic determinants of common complex diseases. To date, 83 single nucleotide polymorphisms (SNPs) for type 2 diabetes have been identified using GWAS. Standard statistical tests for single- and multi-locus analysis, such as logistic regression, have contributed little to understanding the genetic architecture of complex human diseases. Logistic regression captures linear interactions but neglects the non-linear epistatic interactions present within genetic data. There is an urgent need to detect epistatic interactions in complex diseases, as this may explain the remaining missing heritability in such diseases. In this paper, we present a novel framework based on deep learning algorithms that deal with the non-linear epistatic interactions that exist in genome-wide association data. Logistic association analysis under an additive genetic model, adjusted for the genomic control inflation factor, is conducted to remove statistically improbable SNPs and so minimize computational overheads.
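The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the per-SNP association tests have already produced 1-degree-of-freedom chi-square statistics, and it uses the standard genomic control definition: the inflation factor is the median observed statistic divided by the median of a 1-df chi-square distribution (about 0.4549). The function names, the SNP identifiers, the p-value threshold, and the choice to clamp the factor at 1 are all assumptions made for illustration. The abstract specifies none of them.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control_lambda(chi2_stats):
    """Inflation factor: observed median statistic over the expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(stat / 2.0))


def filter_snps(chi2_by_snp, p_threshold):
    """Return (lambda, {snp: adjusted p}) for SNPs that pass the threshold.

    Assumption: lambda is clamped at 1 so that statistics are never
    inflated by the correction, a common convention.
    """
    lam = max(1.0, genomic_control_lambda(list(chi2_by_snp.values())))
    kept = {}
    for snp, stat in chi2_by_snp.items():
        p = chi2_1df_pvalue(stat / lam)  # genomic-control-adjusted p-value
        if p <= p_threshold:
            kept[snp] = p
    return lam, kept


if __name__ == "__main__":
    # Hypothetical statistics for three made-up SNP identifiers.
    stats = {"rs_a": 0.4549364, "rs_b": 0.1, "rs_c": 30.0}
    lam, kept = filter_snps(stats, p_threshold=1e-4)  # threshold is illustrative
    print(lam, sorted(kept))
```

Only the SNPs that survive this filter would be passed to the downstream deep learning model, which is how the step reduces computational overhead.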
