Learning end-to-end patient representations through self-supervised covariate balancing for causal treatment effect estimation

A causal effect can be defined as a comparison of the outcomes that result from two or more alternative actions, of which only one action-outcome pair is actually observed. In healthcare, the gold standard for measuring causal effects is the randomized controlled trial (RCT), in which a target population is explicitly defined and each study sample is randomly assigned to either the treatment or the control cohort. The potential to derive actionable insights from causal relationships has led to a growing body of machine-learning research applying causal effect estimators to observational data in healthcare, education, and economics. The primary difference between causal effect studies on observational data and RCTs is that with observational data the study occurs after treatment, so there is no control over the treatment assignment mechanism. This can lead to large differences in covariate distributions between the control and treatment samples, making comparisons of causal effects confounded and unreliable. Classical approaches address this problem piecemeal, first predicting treatment assignment and then estimating the treatment effect in a separate step. Recent work has extended these approaches into a family of representation-learning algorithms, showing that the upper bound on the expected treatment effect estimation error is determined by two factors: the outcome generalization error of the representation and the distance between the treated and control distributions induced by the representation. To minimize this dissimilarity while learning such distributions, we propose a specific auto-balancing, self-supervised objective. Experiments on real and benchmark datasets show that our approach consistently produces less biased estimates than previously published state-of-the-art methods. We demonstrate that the reduction in error is directly attributable to the ability to learn representations that explicitly reduce this dissimilarity; furthermore, when the positivity assumption is violated (a frequent occurrence in observational data), our approach performs significantly better than the previous state of the art. Thus, by learning representations that induce similar distributions for the treated and control cohorts, we both present evidence supporting the error-bound dissimilarity hypothesis and provide a new state-of-the-art model for causal effect estimation.
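
The two-factor error bound referenced above comes from the representation-learning analysis of counterfactual regression (in the style of Shalit et al., 2017). Schematically, for a representation Φ and outcome hypothesis f, with per-arm factual generalization errors and an integral probability metric IPM_G between the treated and control representation distributions, the bound takes roughly the following form (constants and additive noise terms are omitted here, so this is a sketch rather than the exact statement):

    \epsilon_{\mathrm{PEHE}}(f,\Phi) \;\le\; 2\left( \epsilon_F^{t=1}(f,\Phi) + \epsilon_F^{t=0}(f,\Phi) + B_\Phi \,\mathrm{IPM}_G\!\left(p_\Phi^{t=1},\, p_\Phi^{t=0}\right) \right)

To make the auto-balancing, self-supervised objective concrete, below is a minimal PyTorch sketch. It assumes a TARNet-style architecture (shared encoder, two outcome heads, a propensity head) and a balancing term that penalizes the squared distance between the inverse-propensity-weighted covariate means of the treated and control samples in a batch. The class and function names, the hyperparameter alpha, and the exact weighting scheme are illustrative assumptions, not the paper's verbatim formulation.

    import torch
    import torch.nn as nn

    class BalancingCATE(nn.Module):
        # Sketch of a TARNet-style estimator with a self-supervised
        # covariate-balancing term (illustrative, not the paper's exact code).
        def __init__(self, x_dim, h_dim=64):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(x_dim, h_dim), nn.ELU(),
                nn.Linear(h_dim, h_dim), nn.ELU(),
            )
            self.out_t = nn.Linear(h_dim, 1)  # outcome head, treated arm
            self.out_c = nn.Linear(h_dim, 1)  # outcome head, control arm
            self.prop = nn.Linear(h_dim, 1)   # propensity head (treatment assignment)

        def forward(self, x):
            phi = self.encoder(x)
            return self.out_t(phi), self.out_c(phi), torch.sigmoid(self.prop(phi))

    def loss_fn(model, x, t, y, alpha=1.0, eps=1e-3):
        # t and y have shape (n, 1); t is a 0/1 treatment indicator.
        y_t, y_c, g = model(x)
        y_hat = t * y_t + (1 - t) * y_c
        factual_mse = ((y - y_hat) ** 2).mean()  # outcome generalization term
        # Auto-balancing term (assumed form): inverse-propensity-weighted
        # covariate means of the two arms should coincide. It uses only
        # covariates and treatment indicators, never outcome labels, which
        # is what makes it self-supervised.
        w_t = t / g.clamp(min=eps)
        w_c = (1 - t) / (1 - g).clamp(min=eps)
        mu_t = (w_t * x).sum(dim=0) / w_t.sum().clamp(min=eps)
        mu_c = (w_c * x).sum(dim=0) / w_c.sum().clamp(min=eps)
        balance = ((mu_t - mu_c) ** 2).sum()
        return factual_mse + alpha * balance

    # Hypothetical usage on synthetic data:
    n, d = 512, 25
    x = torch.randn(n, d)
    t = torch.randint(0, 2, (n, 1)).float()
    y = torch.randn(n, 1)
    model = BalancingCATE(d)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss = loss_fn(model, x, t, y)
    loss.backward()
    opt.step()

At inference time, the per-patient treatment effect estimate is the difference of the two outcome heads, y_t − y_c; the propensity head serves only to weight the balancing term during training.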

Datasets

IHDP, Jobs

Results from the Paper


Task             | Dataset | Model  | Metric Name                                   | Metric Value | Global Rank
Causal Inference | IHDP    | BCAUSS | Average Treatment Effect Error                | 0.15         | #1
Causal Inference | Jobs    | BCAUSS | Average Treatment Effect on the Treated Error | 0.05         | #1
