Search Results for author: Chengchun Shi

Found 36 papers, 18 papers with code

An Analysis of Switchback Designs in Reinforcement Learning

no code implementations26 Mar 2024 Qianglin Wen, Chengchun Shi, Ying Yang, Niansheng Tang, Hongtu Zhu

Our aim is to thoroughly evaluate the effects of these designs on the accuracy of their resulting average treatment effect (ATE) estimators.

reinforcement-learning

Pessimistic Causal Reinforcement Learning with Mediators for Confounded Offline Data

no code implementations18 Mar 2024 Danyang Wang, Chengchun Shi, Shikai Luo, Will Wei Sun

As a result, leveraging large observational datasets becomes a more attractive option for achieving high-quality policy learning.

reinforcement-learning Reinforcement Learning (RL) +1

Robust Offline Reinforcement Learning with Heavy-Tailed Rewards

1 code implementation28 Oct 2023 Jin Zhu, Runzhe Wan, Zhengling Qi, Shikai Luo, Chengchun Shi

This paper aims to improve the robustness of offline reinforcement learning (RL) in settings with heavy-tailed rewards, a common occurrence in real-world applications.

Offline RL Off-policy evaluation +1
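
As a rough illustration of the underlying issue (not the paper's estimator), the Python sketch below shows how a median-of-means estimate of a heavy-tailed reward stays stable where the plain sample mean does not; the t-distribution and block count are arbitrary choices for the demo:

    import numpy as np

    rng = np.random.default_rng(0)
    # Heavy-tailed rewards: Student-t with 1.5 degrees of freedom (infinite variance).
    rewards = rng.standard_t(df=1.5, size=10_000)

    def median_of_means(x, n_blocks=20):
        """Split samples into blocks, average each block, take the median.
        Far more robust to heavy tails than the plain sample mean."""
        blocks = np.array_split(x, n_blocks)
        return np.median([b.mean() for b in blocks])

    print("plain mean:      ", rewards.mean())
    print("median of means: ", median_of_means(rewards))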

Off-policy Evaluation in Doubly Inhomogeneous Environments

no code implementations14 Jun 2023 Zeyu Bian, Chengchun Shi, Zhengling Qi, Lan Wang

This work studies off-policy evaluation (OPE) in scenarios where two key reinforcement learning (RL) assumptions, temporal stationarity and individual homogeneity, are both violated.

Offline RL Off-policy evaluation

Testing for the Markov Property in Time Series via Deep Conditional Generative Learning

1 code implementation30 May 2023 Yunzhe Zhou, Chengchun Shi, Lexin Li, Qiwei Yao

In this article, we propose a nonparametric test for the Markov property in high-dimensional time series via deep conditional generative learning.

Time Series

Evaluating Dynamic Conditional Quantile Treatment Effects with Applications in Ridesharing

no code implementations17 May 2023 Ting Li, Chengchun Shi, Zhaohua Lu, Yi Li, Hongtu Zhu

However, assessing dynamic quantile treatment effects (QTE) remains a challenge, particularly when dealing with data from ride-sourcing platforms that involve sequential decision-making across time and space.

Decision Making
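
For intuition, a static quantile treatment effect at level tau is simply the difference of the two marginal outcome quantiles; the toy below uses hypothetical lognormal outcomes (the paper's dynamic, sequential setting is substantially harder):

    import numpy as np

    rng = np.random.default_rng(1)
    # Hypothetical treated/control outcomes (e.g., earnings under two policies).
    treated = rng.lognormal(mean=0.1, sigma=1.0, size=5000)
    control = rng.lognormal(mean=0.0, sigma=1.0, size=5000)

    # Static quantile treatment effect at level tau: difference of quantiles.
    for tau in (0.25, 0.5, 0.75):
        qte = np.quantile(treated, tau) - np.quantile(control, tau)
        print(f"QTE at tau={tau}: {qte:.3f}")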

Sequential Knockoffs for Variable Selection in Reinforcement Learning

no code implementations24 Mar 2023 Tao Ma, Hengrui Cai, Zhengling Qi, Chengchun Shi, Eric B. Laber

In real-world applications of reinforcement learning, it is often challenging to obtain a state representation that is parsimonious and satisfies the Markov property without prior knowledge.

reinforcement-learning Variable Selection
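
As a loose illustration of the knockoff filter idea only (not the paper's sequential construction, and without its FDR-calibrated threshold), the sketch below uses row-permuted columns as crude knockoff stand-ins and keeps features whose importance beats that of their stand-in:

    import numpy as np

    rng = np.random.default_rng(2)
    n, p = 500, 10
    X = rng.normal(size=(n, p))
    y = X[:, 0] + 0.8 * X[:, 1] + rng.normal(size=n)  # only features 0, 1 matter

    # Crude knockoff stand-ins: row-permuted copies of each column
    # (breaks the X-y association while keeping marginals).
    X_knock = np.apply_along_axis(rng.permutation, 0, X)

    # Importance statistic W_j: |corr(X_j, y)| - |corr(knockoff_j, y)|.
    def abs_corr(a, b):
        return abs(np.corrcoef(a, b)[0, 1])

    W = np.array([abs_corr(X[:, j], y) - abs_corr(X_knock[:, j], y)
                  for j in range(p)])
    selected = np.where(W > 0.1)[0]  # ad-hoc threshold, illustration only
    print("selected features:", selected)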

A Reinforcement Learning Framework for Dynamic Mediation Analysis

1 code implementation31 Jan 2023 Lin Ge, Jitao Wang, Chengchun Shi, Zhenke Wu, Rui Song

However, there are a number of applications (e.g., mobile health) where the treatments are sequentially assigned over time and the dynamic mediation effects are of primary interest.

reinforcement-learning Reinforcement Learning (RL)

Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization

no code implementations5 Jan 2023 Chengchun Shi, Zhengling Qi, Jianing Wang, Fan Zhou

When the initial policy is consistent, under some mild conditions, our method will yield a policy whose value converges to the optimal one at a faster rate than the initial policy, achieving the desired "value enhancement" property.

Decision Making reinforcement-learning +1

Deep Spectral Q-learning with Application to Mobile Health

no code implementations3 Jan 2023 Yuhe Gao, Chengchun Shi, Rui Song

Dynamic treatment regimes assign personalized treatments to patients sequentially over time based on their baseline information and time-varying covariates.

Q-Learning

Quantile Off-Policy Evaluation via Deep Conditional Generative Learning

no code implementations29 Dec 2022 Yang Xu, Chengchun Shi, Shikai Luo, Lan Wang, Rui Song

Off-policy evaluation (OPE) is concerned with evaluating a new target policy using offline data generated by a potentially different behavior policy.

Decision Making Off-policy evaluation

An Instrumental Variable Approach to Confounded Off-Policy Evaluation

no code implementations29 Dec 2022 Yang Xu, Jin Zhu, Chengchun Shi, Shikai Luo, Rui Song

Off-policy evaluation (OPE) is a method for estimating the return of a target policy using some pre-collected observational data generated by a potentially different behavior policy.

Decision Making Off-policy evaluation

A Review of Off-Policy Evaluation in Reinforcement Learning

no code implementations13 Dec 2022 Masatoshi Uehara, Chengchun Shi, Nathan Kallus

Reinforcement learning (RL) is one of the most vibrant research frontiers in machine learning and has been recently applied to solve a number of challenging problems.

Off-policy evaluation reinforcement-learning

Doubly Inhomogeneous Reinforcement Learning

1 code implementation8 Nov 2022 Liyuan Hu, Mengbing Li, Chengchun Shi, Zhenke Wu, Piotr Fryzlewicz

Moreover, by borrowing information over time and population, it allows us to detect weaker signals and has better convergence properties compared to applying the clustering algorithm at each time point or the change point detection algorithm to each subject.

Change Point Detection Clustering +3

Optimizing Pessimism in Dynamic Treatment Regimes: A Bayesian Learning Approach

1 code implementation26 Oct 2022 Yunzhe Zhou, Zhengling Qi, Chengchun Shi, Lexin Li

In this article, we propose a novel pessimism-based Bayesian learning method for optimal dynamic treatment regimes in the offline setting.

Thompson Sampling Variational Inference

Blessing from Human-AI Interaction: Super Reinforcement Learning in Confounded Environments

no code implementations29 Sep 2022 Jiayi Wang, Zhengling Qi, Chengchun Shi

This approach uses the observed action, whether from AI or humans, as input to achieve a stronger oracle in policy learning for the decision maker (humans or AI).

Decision Making reinforcement-learning +1

Semi-supervised Batch Learning From Logged Data

no code implementations15 Sep 2022 Gholamali Aminian, Armin Behnamnia, Roberto Vega, Laura Toni, Chengchun Shi, Hamid R. Rabiee, Omar Rivasplata, Miguel R. D. Rodrigues

We propose learning methods for problems where feedback is missing for some samples, so the logged data contain both samples with feedback and samples with missing feedback.

counterfactual

Future-Dependent Value-Based Off-Policy Evaluation in POMDPs

1 code implementation NeurIPS 2023 Masatoshi Uehara, Haruka Kiyohara, Andrew Bennett, Victor Chernozhukov, Nan Jiang, Nathan Kallus, Chengchun Shi, Wen Sun

Finally, we extend our methods to learning of dynamics and establish the connection between our approach and the well-known spectral learning methods in POMDPs.

Off-policy evaluation

Conformal Off-policy Prediction

1 code implementation14 Jun 2022 Yingying Zhang, Chengchun Shi, Shikai Luo

Off-policy evaluation is critical in a number of applications where new policies need to be evaluated offline before online deployment.

Conformal Prediction Off-policy evaluation +2
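
For background, a minimal split-conformal sketch in the plain supervised setting appears below; the off-policy version in the paper must additionally reweight calibration scores by policy ratios, which this toy omits:

    import numpy as np

    rng = np.random.default_rng(3)
    n = 2000
    x = rng.uniform(-2, 2, size=n)
    y = np.sin(x) + 0.3 * rng.normal(size=n)

    # Split conformal: fit on one half, calibrate residuals on the other.
    fit_idx, cal_idx = np.arange(n // 2), np.arange(n // 2, n)
    coefs = np.polyfit(x[fit_idx], y[fit_idx], deg=3)
    predict = lambda z: np.polyval(coefs, z)

    alpha = 0.1
    scores = np.abs(y[cal_idx] - predict(x[cal_idx]))
    level = np.ceil((1 - alpha) * (len(cal_idx) + 1)) / len(cal_idx)
    q = np.quantile(scores, level)

    x_new = 0.5
    print(f"90% interval at x={x_new}: "
          f"[{predict(x_new) - q:.3f}, {predict(x_new) + q:.3f}]")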

Testing Stationarity and Change Point Detection in Reinforcement Learning

1 code implementation3 Mar 2022 Mengbing Li, Chengchun Shi, Zhenke Wu, Piotr Fryzlewicz

Based on the proposed test, we further develop a sequential change point detection method that can be naturally coupled with existing state-of-the-art RL methods for policy optimization in nonstationary environments.

Change Point Detection reinforcement-learning +1
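
A bare-bones version of the mean-shift idea behind such tests is sketched below: scan candidate change points and compare the scaled difference of average rewards before and after each one (the paper's test is far more general, covering nonstationary transition dynamics rather than a single mean shift):

    import numpy as np

    rng = np.random.default_rng(4)
    # Rewards whose mean shifts at t = 300 (a nonstationary environment).
    rewards = np.concatenate([rng.normal(0.0, 1, 300), rng.normal(0.8, 1, 200)])

    def cusum_stat(x, t):
        """Scaled difference in means before/after candidate change point t."""
        n = len(x)
        return np.sqrt(t * (n - t) / n) * abs(x[:t].mean() - x[t:].mean())

    candidates = np.arange(50, len(rewards) - 50)
    stats = np.array([cusum_stat(rewards, t) for t in candidates])
    print("estimated change point:", candidates[stats.argmax()])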

Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons

1 code implementation26 Feb 2022 Chengchun Shi, Shikai Luo, Yuan Le, Hongtu Zhu, Rui Song

We consider reinforcement learning (RL) methods in offline domains without additional online data collection, such as mobile health applications.

reinforcement-learning Reinforcement Learning (RL)

Off-Policy Confidence Interval Estimation with Confounded Markov Decision Process

1 code implementation22 Feb 2022 Chengchun Shi, Jin Zhu, Ye Shen, Shikai Luo, Hongtu Zhu, Rui Song

In this paper, we show that with some auxiliary variables that mediate the effect of actions on the system dynamics, the target policy's value is identifiable in a confounded Markov decision process.

Uncertainty Quantification

Policy Evaluation for Temporal and/or Spatial Dependent Experiments

no code implementations22 Feb 2022 Shikai Luo, Ying Yang, Chengchun Shi, Fang Yao, Jieping Ye, Hongtu Zhu

The aim of this paper is to establish a causal link between the policies implemented by technology companies and the outcomes they yield within intricate temporally and/or spatially dependent experiments.

Marketing

A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets

1 code implementation21 Feb 2022 Chengchun Shi, Runzhe Wan, Ge Song, Shikai Luo, Rui Song, Hongtu Zhu

In this paper, we consider large-scale fleet management in ride-sharing companies, which involves multiple units in different areas receiving sequences of products (or treatments) over time.

Management Multi-agent Reinforcement Learning +1

Jump Interval-Learning for Individualized Decision Making

no code implementations17 Nov 2021 Hengrui Cai, Chengchun Shi, Rui Song, Wenbin Lu

To derive an optimal I2DR, our jump interval-learning method estimates the conditional mean of the outcome given the treatment and the covariates via jump penalized regression, and derives the corresponding optimal I2DR based on the estimated outcome regression function.

Decision Making regression

A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes

1 code implementation12 Nov 2021 Chengchun Shi, Masatoshi Uehara, Jiawei Huang, Nan Jiang

In this work, we first propose novel identification methods for OPE in POMDPs with latent confounders, by introducing bridge functions that link the target policy's value and the observed data distribution.

Off-policy evaluation

Testing Directed Acyclic Graph via Structural, Supervised and Generative Adversarial Learning

1 code implementation2 Jun 2021 Chengchun Shi, Yunzhe Zhou, Lexin Li

In this article, we propose a new hypothesis testing method for directed acyclic graphs (DAGs).

Additive models

Deeply-Debiased Off-Policy Interval Estimation

1 code implementation10 May 2021 Chengchun Shi, Runzhe Wan, Victor Chernozhukov, Rui Song

Off-policy evaluation learns a target policy's value with a historical dataset generated by a different behavior policy.

Off-policy evaluation
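
For reference, the simplest such estimator is one-step importance sampling, sketched below on a toy two-action bandit with known behavior and target policies; the paper's deeply-debiased construction builds interval estimates on much more refined machinery:

    import numpy as np

    rng = np.random.default_rng(5)
    n = 10_000
    # One-step toy bandit: behavior policy picks action 1 w.p. 0.3,
    # the target policy would pick it w.p. 0.7.
    behavior_p1, target_p1 = 0.3, 0.7
    actions = rng.binomial(1, behavior_p1, size=n)
    rewards = rng.normal(loc=np.where(actions == 1, 1.0, 0.0), scale=1.0)

    # Importance-sampling estimate of the target policy's value.
    w = np.where(actions == 1, target_p1 / behavior_p1,
                 (1 - target_p1) / (1 - behavior_p1))
    print("IS estimate:", np.mean(w * rewards))   # truth: 0.7 * 1.0 = 0.7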

A Reinforcement Learning Framework for Time-Dependent Causal Effects Evaluation in A/B Testing

no code implementations1 Jan 2021 Chengchun Shi, Xiaoyu Wang, Shikai Luo, Rui Song, Hongtu Zhu, Jieping Ye

A/B testing, or online experimentation, is a standard business strategy for comparing a new product with an old one in the pharmaceutical, technological, and traditional industries.

Reinforcement Learning (RL)

Double Generative Adversarial Networks for Conditional Independence Testing

1 code implementation3 Jun 2020 Chengchun Shi, Tianlin Xu, Wicher Bergsma, Lexin Li

In this article, we study the problem of high-dimensional conditional independence testing, a key building block in statistics and machine learning.
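
As a point of comparison, the classical linear baseline for this problem is a partial-correlation test with Fisher's z-transform, sketched below on simulated data where X and Y are conditionally independent given Z; the paper's double-GAN approach targets high-dimensional, nonlinear settings this baseline cannot handle:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    n = 1000
    z = rng.normal(size=n)
    x = z + rng.normal(size=n)
    y = z + rng.normal(size=n)   # X and Y are independent given Z

    # Residualize X and Y on Z, then test the correlation of the residuals
    # via Fisher's z-transform.
    rx = x - np.polyval(np.polyfit(z, x, 1), z)
    ry = y - np.polyval(np.polyfit(z, y, 1), z)
    r = np.corrcoef(rx, ry)[0, 1]
    fisher = np.arctanh(r) * np.sqrt(n - 1 - 3)  # one conditioning variable
    p_value = 2 * stats.norm.sf(abs(fisher))
    print(f"partial correlation {r:.3f}, p-value {p_value:.3f}")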

Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework

1 code implementation5 Feb 2020 Chengchun Shi, Xiaoyu Wang, Shikai Luo, Hongtu Zhu, Jieping Ye, Rui Song

A/B testing, or online experimentation, is a standard business strategy for comparing a new product with an old one in the pharmaceutical, technological, and traditional industries.

reinforcement-learning Reinforcement Learning (RL)

Robust Learning for Optimal Treatment Decision with NP-Dimensionality

no code implementations15 Oct 2015 Chengchun Shi, Rui Song, Wenbin Lu

In this paper, we propose a two-step estimation procedure for deriving the optimal treatment regime under NP dimensionality.
