Search Results for author: Chengchun Shi

Found 36 papers, 18 papers with code

An Analysis of Switchback Designs in Reinforcement Learning

no code implementations26 Mar 2024 Qianglin Wen, Chengchun Shi, Ying Yang, Niansheng Tang, Hongtu Zhu

Our aim is to thoroughly evaluate the effects of these designs on the accuracy of their resulting average treatment effect (ATE) estimators.

reinforcement-learning

Pessimistic Causal Reinforcement Learning with Mediators for Confounded Offline Data

no code implementations18 Mar 2024 Danyang Wang, Chengchun Shi, Shikai Luo, Will Wei Sun

As a result, leveraging large observational datasets becomes a more attractive option for achieving high-quality policy learning.

reinforcement-learning Reinforcement Learning (RL) +1

Robust Offline Reinforcement Learning with Heavy-Tailed Rewards

1 code implementation28 Oct 2023 Jin Zhu, Runzhe Wan, Zhengling Qi, Shikai Luo, Chengchun Shi

This paper aims to improve the robustness of offline reinforcement learning (RL) in settings with heavy-tailed rewards, a common occurrence in real-world applications.

Offline RL Off-policy evaluation +1
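
As a rough illustration of the underlying issue (not the paper's estimator), the Python sketch below shows how a median-of-means estimate of a heavy-tailed reward stays stable where the plain sample mean does not; the t-distribution and block count are arbitrary choices for the demo:

    import numpy as np

    rng = np.random.default_rng(0)
    # Heavy-tailed rewards: Student-t with 1.5 degrees of freedom (infinite variance).
    rewards = rng.standard_t(df=1.5, size=10_000)

    def median_of_means(x, n_blocks=20):
        """Split samples into blocks, average each block, take the median.
        Far more robust to heavy tails than the plain sample mean."""
        blocks = np.array_split(x, n_blocks)
        return np.median([b.mean() for b in blocks])

    print("plain mean:      ", rewards.mean())
    print("median of means: ", median_of_means(rewards))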

Off-policy Evaluation in Doubly Inhomogeneous Environments

no code implementations14 Jun 2023 Zeyu Bian, Chengchun Shi, Zhengling Qi, Lan Wang

This work studies off-policy evaluation (OPE) in scenarios where two key reinforcement learning (RL) assumptions, temporal stationarity and individual homogeneity, are both violated.

Offline RL Off-policy evaluation

Testing for the Markov Property in Time Series via Deep Conditional Generative Learning

1 code implementation30 May 2023 Yunzhe Zhou, Chengchun Shi, Lexin Li, Qiwei Yao

In this article, we propose a nonparametric test for the Markov property in high-dimensional time series via deep conditional generative learning.

Time Series

Evaluating Dynamic Conditional Quantile Treatment Effects with Applications in Ridesharing

no code implementations17 May 2023 Ting Li, Chengchun Shi, Zhaohua Lu, Yi Li, Hongtu Zhu

However, assessing dynamic quantile treatment effects (QTE) remains a challenge, particularly when dealing with data from ride-sourcing platforms that involve sequential decision-making across time and space.

Decision Making
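
For intuition, a static quantile treatment effect at level tau is simply the difference of the two marginal outcome quantiles; the toy below uses hypothetical lognormal outcomes (the paper's dynamic, sequential setting is substantially harder):

    import numpy as np

    rng = np.random.default_rng(1)
    # Hypothetical treated/control outcomes (e.g., earnings under two policies).
    treated = rng.lognormal(mean=0.1, sigma=1.0, size=5000)
    control = rng.lognormal(mean=0.0, sigma=1.0, size=5000)

    # Static quantile treatment effect at level tau: difference of quantiles.
    for tau in (0.25, 0.5, 0.75):
        qte = np.quantile(treated, tau) - np.quantile(control, tau)
        print(f"QTE at tau={tau}: {qte:.3f}")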

Sequential Knockoffs for Variable Selection in Reinforcement Learning

no code implementations24 Mar 2023 Tao Ma, Hengrui Cai, Zhengling Qi, Chengchun Shi, Eric B. Laber

In real-world applications of reinforcement learning, it is often challenging to obtain a state representation that is parsimonious and satisfies the Markov property without prior knowledge.

reinforcement-learning Variable Selection
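
As a loose illustration of the knockoff filter idea only (not the paper's sequential construction, and without its FDR-calibrated threshold), the sketch below uses row-permuted columns as crude knockoff stand-ins and keeps features whose importance beats that of their stand-in:

    import numpy as np

    rng = np.random.default_rng(2)
    n, p = 500, 10
    X = rng.normal(size=(n, p))
    y = X[:, 0] + 0.8 * X[:, 1] + rng.normal(size=n)  # only features 0, 1 matter

    # Crude knockoff stand-ins: row-permuted copies of each column
    # (breaks the X-y association while keeping marginals).
    X_knock = np.apply_along_axis(rng.permutation, 0, X)

    # Importance statistic W_j: |corr(X_j, y)| - |corr(knockoff_j, y)|.
    def abs_corr(a, b):
        return abs(np.corrcoef(a, b)[0, 1])

    W = np.array([abs_corr(X[:, j], y) - abs_corr(X_knock[:, j], y)
                  for j in range(p)])
    selected = np.where(W > 0.1)[0]  # ad-hoc threshold, illustration only
    print("selected features:", selected)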

A Reinforcement Learning Framework for Dynamic Mediation Analysis

1 code implementation31 Jan 2023 Lin Ge, Jitao Wang, Chengchun Shi, Zhenke Wu, Rui Song

However, there are a number of applications (e.g., mobile health) where the treatments are sequentially assigned over time and the dynamic mediation effects are of primary interest.

reinforcement-learning Reinforcement Learning (RL)

Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization

no code implementations5 Jan 2023 Chengchun Shi, Zhengling Qi, Jianing Wang, Fan Zhou

When the initial policy is consistent, under some mild conditions, our method will yield a policy whose value converges to the optimal one at a faster rate than the initial policy, achieving the desired "value enhancement" property.

Decision Making reinforcement-learning +1

Deep Spectral Q-learning with Application to Mobile Health

no code implementations3 Jan 2023 Yuhe Gao, Chengchun Shi, Rui Song

Dynamic treatment regimes assign personalized treatments to patients sequentially over time based on their baseline information and time-varying covariates.

Q-Learning

Quantile Off-Policy Evaluation via Deep Conditional Generative Learning

no code implementations29 Dec 2022 Yang Xu, Chengchun Shi, Shikai Luo, Lan Wang, Rui Song

Off-policy evaluation (OPE) is concerned with evaluating a new target policy using offline data generated by a potentially different behavior policy.

Decision Making Off-policy evaluation

An Instrumental Variable Approach to Confounded Off-Policy Evaluation

no code implementations29 Dec 2022 Yang Xu, Jin Zhu, Chengchun Shi, Shikai Luo, Rui Song

Off-policy evaluation (OPE) is a method for estimating the return of a target policy using some pre-collected observational data generated by a potentially different behavior policy.

Decision Making Off-policy evaluation

A Review of Off-Policy Evaluation in Reinforcement Learning

no code implementations13 Dec 2022 Masatoshi Uehara, Chengchun Shi, Nathan Kallus

Reinforcement learning (RL) is one of the most vibrant research frontiers in machine learning and has been recently applied to solve a number of challenging problems.

Off-policy evaluation reinforcement-learning

Doubly Inhomogeneous Reinforcement Learning

1 code implementation8 Nov 2022 Liyuan Hu, Mengbing Li, Chengchun Shi, Zhenke Wu, Piotr Fryzlewicz

Moreover, by borrowing information over time and population, it allows us to detect weaker signals and has better convergence properties compared to applying the clustering algorithm at each time point or the change point detection algorithm to each subject.

Change Point Detection Clustering +3

Optimizing Pessimism in Dynamic Treatment Regimes: A Bayesian Learning Approach

1 code implementation26 Oct 2022 Yunzhe Zhou, Zhengling Qi, Chengchun Shi, Lexin Li

In this article, we propose a novel pessimism-based Bayesian learning method for optimal dynamic treatment regimes in the offline setting.

Thompson Sampling Variational Inference

Blessing from Human-AI Interaction: Super Reinforcement Learning in Confounded Environments

no code implementations29 Sep 2022 Jiayi Wang, Zhengling Qi, Chengchun Shi

This approach uses the observed action, whether from AI or humans, as input to achieve a stronger oracle in policy learning for the decision maker (humans or AI).

Decision Making reinforcement-learning +1

Semi-supervised Batch Learning From Logged Data

no code implementations15 Sep 2022 Gholamali Aminian, Armin Behnamnia, Roberto Vega, Laura Toni, Chengchun Shi, Hamid R. Rabiee, Omar Rivasplata, Miguel R. D. Rodrigues

We propose learning methods for problems where feedback is missing for some samples, so the logged data contain both samples with feedback and samples with missing feedback.

counterfactual

Future-Dependent Value-Based Off-Policy Evaluation in POMDPs

1 code implementation NeurIPS 2023 Masatoshi Uehara, Haruka Kiyohara, Andrew Bennett, Victor Chernozhukov, Nan Jiang, Nathan Kallus, Chengchun Shi, Wen Sun

Finally, we extend our methods to learning of dynamics and establish the connection between our approach and the well-known spectral learning methods in POMDPs.

Off-policy evaluation

Conformal Off-policy Prediction

1 code implementation14 Jun 2022 Yingying Zhang, Chengchun Shi, Shikai Luo

Off-policy evaluation is critical in a number of applications where new policies need to be evaluated offline before online deployment.

Conformal Prediction Off-policy evaluation +2
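
For background, a minimal split-conformal sketch in the plain supervised setting appears below; the off-policy version in the paper must additionally reweight calibration scores by policy ratios, which this toy omits:

    import numpy as np

    rng = np.random.default_rng(3)
    n = 2000
    x = rng.uniform(-2, 2, size=n)
    y = np.sin(x) + 0.3 * rng.normal(size=n)

    # Split conformal: fit on one half, calibrate residuals on the other.
    fit_idx, cal_idx = np.arange(n // 2), np.arange(n // 2, n)
    coefs = np.polyfit(x[fit_idx], y[fit_idx], deg=3)
    predict = lambda z: np.polyval(coefs, z)

    alpha = 0.1
    scores = np.abs(y[cal_idx] - predict(x[cal_idx]))
    level = np.ceil((1 - alpha) * (len(cal_idx) + 1)) / len(cal_idx)
    q = np.quantile(scores, level)

    x_new = 0.5
    print(f"90% interval at x={x_new}: "
          f"[{predict(x_new) - q:.3f}, {predict(x_new) + q:.3f}]")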

Testing Stationarity and Change Point Detection in Reinforcement Learning

1 code implementation3 Mar 2022 Mengbing Li, Chengchun Shi, Zhenke Wu, Piotr Fryzlewicz

Based on the proposed test, we further develop a sequential change point detection method that can be naturally coupled with existing state-of-the-art RL methods for policy optimization in nonstationary environments.

Change Point Detection reinforcement-learning +1
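
A bare-bones version of the mean-shift idea behind such tests is sketched below: scan candidate change points and compare the scaled difference of average rewards before and after each one (the paper's test is far more general, covering nonstationary transition dynamics rather than a single mean shift):

    import numpy as np

    rng = np.random.default_rng(4)
    # Rewards whose mean shifts at t = 300 (a nonstationary environment).
    rewards = np.concatenate([rng.normal(0.0, 1, 300), rng.normal(0.8, 1, 200)])

    def cusum_stat(x, t):
        """Scaled difference in means before/after candidate change point t."""
        n = len(x)
        return np.sqrt(t * (n - t) / n) * abs(x[:t].mean() - x[t:].mean())

    candidates = np.arange(50, len(rewards) - 50)
    stats = np.array([cusum_stat(rewards, t) for t in candidates])
    print("estimated change point:", candidates[stats.argmax()])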

Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons

1 code implementation26 Feb 2022 Chengchun Shi, Shikai Luo, Yuan Le, Hongtu Zhu, Rui Song

We consider reinforcement learning (RL) methods in offline domains without additional online data collection, such as mobile health applications.

reinforcement-learning Reinforcement Learning (RL)

Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process

1 code implementation22 Feb 2022 Chengchun Shi, Jin Zhu, Ye Shen, Shikai Luo, Hongtu Zhu, Rui Song

In this paper, we show that with some auxiliary variables that mediate the effect of actions on the system dynamics, the target policy's value is identifiable in a confounded Markov decision process.

Uncertainty Quantification

Policy Evaluation for Temporal and/or Spatial Dependent Experiments

no code implementations22 Feb 2022 Shikai Luo, Ying Yang, Chengchun Shi, Fang Yao, Jieping Ye, Hongtu Zhu

The aim of this paper is to establish a causal link between the policies implemented by technology companies and the outcomes they yield within intricate temporally and/or spatially dependent experiments.

Marketing

A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets

1 code implementation21 Feb 2022 Chengchun Shi, Runzhe Wan, Ge Song, Shikai Luo, Rui Song, Hongtu Zhu

In this paper, we consider large-scale fleet management in ride-sharing companies, which involves multiple units in different areas receiving sequences of products (or treatments) over time.

Management Multi-agent Reinforcement Learning +1

Jump Interval-Learning for Individualized Decision Making

no code implementations17 Nov 2021 Hengrui Cai, Chengchun Shi, Rui Song, Wenbin Lu

To derive an optimal I2DR, our jump interval-learning method estimates the conditional mean of the outcome given the treatment and the covariates via jump penalized regression, and derives the corresponding optimal I2DR based on the estimated outcome regression function.

Decision Making regression

A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes

1 code implementation12 Nov 2021 Chengchun Shi, Masatoshi Uehara, Jiawei Huang, Nan Jiang

In this work, we first propose novel identification methods for OPE in POMDPs with latent confounders, by introducing bridge functions that link the target policy's value and the observed data distribution.

Off-policy evaluation

Testing Directed Acyclic Graph via Structural, Supervised and Generative Adversarial Learning

1 code implementation2 Jun 2021 Chengchun Shi, Yunzhe Zhou, Lexin Li

In this article, we propose a new hypothesis testing method for directed acyclic graphs (DAGs).

Additive models

Deeply-Debiased Off-Policy Interval Estimation

1 code implementation10 May 2021 Chengchun Shi, Runzhe Wan, Victor Chernozhukov, Rui Song

Off-policy evaluation learns a target policy's value with a historical dataset generated by a different behavior policy.

Off-policy evaluation
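
For reference, the simplest such estimator is one-step importance sampling, sketched below on a toy two-action bandit with known behavior and target policies; the paper's deeply-debiased construction builds interval estimates on much more refined machinery:

    import numpy as np

    rng = np.random.default_rng(5)
    n = 10_000
    # One-step toy bandit: behavior policy picks action 1 w.p. 0.3,
    # the target policy would pick it w.p. 0.7.
    behavior_p1, target_p1 = 0.3, 0.7
    actions = rng.binomial(1, behavior_p1, size=n)
    rewards = rng.normal(loc=np.where(actions == 1, 1.0, 0.0), scale=1.0)

    # Importance-sampling estimate of the target policy's value.
    w = np.where(actions == 1, target_p1 / behavior_p1,
                 (1 - target_p1) / (1 - behavior_p1))
    print("IS estimate:", np.mean(w * rewards))   # truth: 0.7 * 1.0 = 0.7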

A Reinforcement Learning Framework for Time-Dependent Causal Effects Evaluation in A/B Testing

no code implementations1 Jan 2021 Chengchun Shi, Xiaoyu Wang, Shikai Luo, Rui Song, Hongtu Zhu, Jieping Ye

A/B testing, or online experimentation, is a standard business strategy for comparing a new product with an old one in the pharmaceutical, technological, and traditional industries.

Reinforcement Learning (RL)

Double Generative Adversarial Networks for Conditional Independence Testing

1 code implementation3 Jun 2020 Chengchun Shi, Tianlin Xu, Wicher Bergsma, Lexin Li

In this article, we study the problem of high-dimensional conditional independence testing, a key building block in statistics and machine learning.
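
As a point of comparison, the classical linear baseline for this problem is a partial-correlation test with Fisher's z-transform, sketched below on simulated data where X and Y are conditionally independent given Z; the paper's double-GAN approach targets high-dimensional, nonlinear settings this baseline cannot handle:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    n = 1000
    z = rng.normal(size=n)
    x = z + rng.normal(size=n)
    y = z + rng.normal(size=n)   # X and Y are independent given Z

    # Residualize X and Y on Z, then test the correlation of the residuals
    # via Fisher's z-transform.
    rx = x - np.polyval(np.polyfit(z, x, 1), z)
    ry = y - np.polyval(np.polyfit(z, y, 1), z)
    r = np.corrcoef(rx, ry)[0, 1]
    fisher = np.arctanh(r) * np.sqrt(n - 1 - 3)  # one conditioning variable
    p_value = 2 * stats.norm.sf(abs(fisher))
    print(f"partial correlation {r:.3f}, p-value {p_value:.3f}")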

Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework

1 code implementation5 Feb 2020 Chengchun Shi, Xiaoyu Wang, Shikai Luo, Hongtu Zhu, Jieping Ye, Rui Song

A/B testing, or online experimentation, is a standard business strategy for comparing a new product with an old one in the pharmaceutical, technological, and traditional industries.

reinforcement-learning Reinforcement Learning (RL)

Robust Learning for Optimal Treatment Decision with NP-Dimensionality

no code implementations15 Oct 2015 Chengchun Shi, Rui Song, Wenbin Lu

In this paper, we propose a two-step estimation procedure for deriving the optimal treatment regime under NP dimensionality.
