no code implementations • 9 Mar 2024 • Yang Peng, Liangyu Zhang, Zhihua Zhang
In the tabular case, \citet{rowland2018analysis} and \citet{rowland2023analysis} proved the asymptotic convergence of two instances of distributional TD, namely the categorical temporal difference (CTD) and quantile temporal difference (QTD) algorithms, respectively.
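As a rough illustration of the CTD update analysed in this line of work, here is a minimal tabular sketch of the standard categorical projection step; the layout and names (`ctd_update`, `atoms`, `alpha`) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ctd_update(p, s, r, s_next, atoms, alpha, gamma):
    """One tabular categorical TD (CTD) update for policy evaluation.

    p     : (num_states, num_atoms) categorical probabilities per state
    atoms : (num_atoms,) fixed support z_1 < ... < z_m (assumed equally spaced)
    Sketch of the standard categorical projection, not the authors' implementation.
    """
    z_min, z_max = atoms[0], atoms[-1]
    dz = atoms[1] - atoms[0]
    target = np.zeros_like(atoms)

    # Project the bootstrapped target r + gamma * Z(s') back onto the fixed support.
    for z_j, prob_j in zip(atoms, p[s_next]):
        tz = np.clip(r + gamma * z_j, z_min, z_max)
        b = (tz - z_min) / dz                     # fractional index of tz on the support
        lo, hi = int(np.floor(b)), int(np.ceil(b))
        if lo == hi:                              # tz lands exactly on an atom
            target[lo] += prob_j
        else:                                     # split mass between neighbouring atoms
            target[lo] += prob_j * (hi - b)
            target[hi] += prob_j * (b - lo)

    # Incremental mixture toward the projected target distribution.
    p[s] = (1 - alpha) * p[s] + alpha * target
    return p
```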
1 code implementation • 29 Sep 2023 • Liangyu Zhang, Yang Peng, Jiadong Liang, Wenhao Yang, Zhihua Zhang
This implies that the distributional policy evaluation problem can be solved in a sample-efficient manner.
Distributional Reinforcement Learning • Reinforcement Learning
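For intuition about what distributional policy evaluation estimates, a minimal Monte-Carlo sketch under an assumed generative-model interface is shown below; `sample_step` and `policy` are hypothetical names, and the paper's sample-efficient estimator is not reproduced here.

```python
import numpy as np

def empirical_return_distribution(sample_step, s0, policy, gamma, horizon, n_rollouts, seed=None):
    """Monte-Carlo estimate of the return distribution at state s0.

    sample_step(s, a, rng) -> (reward, next_state) is a hypothetical generative-model
    interface; policy(s, rng) -> action is likewise illustrative.
    """
    rng = np.random.default_rng(seed)
    returns = np.empty(n_rollouts)
    for i in range(n_rollouts):
        s, g, disc = s0, 0.0, 1.0
        for _ in range(horizon):                 # truncate at an effective horizon
            a = policy(s, rng)
            r, s = sample_step(s, a, rng)
            g += disc * r
            disc *= gamma
        returns[i] = g
    return returns                               # empirical samples of the random return
```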
1 code implementation • 29 Apr 2023 • Liangyu Zhang, Yang Peng, Wenhao Yang, Zhihua Zhang
To the best of our knowledge, we are the first to apply tools from semi-infinite programming (SIP) to solve constrained reinforcement learning problems.
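Semi-infinite programs have finitely many decision variables but infinitely many constraints. The generic exchange (cutting-plane) sketch below conveys the idea, assuming the infinite constraint index set can be approximated by a grid; the function names and the reduction of the constrained-RL problem to this form are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def sip_exchange(objective, constraint, param_grid, x0, tol=1e-6, max_iter=50):
    """Exchange method for min_x objective(x) s.t. constraint(x, t) <= 0 for all t.

    In the constrained-RL setting, x would be a policy parameter and t would index
    the infinitely many constraints; here the index set is a fine grid (param_grid).
    """
    active = [param_grid[0]]                     # start with a single constraint
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        cons = [{'type': 'ineq', 'fun': (lambda x, t=t: -constraint(x, t))}
                for t in active]                 # SLSQP expects g(x) >= 0
        x = minimize(objective, x, method='SLSQP', constraints=cons).x

        # Find the most violated constraint over the discretized index set.
        violations = np.array([constraint(x, t) for t in param_grid])
        worst = int(np.argmax(violations))
        if violations[worst] <= tol:             # all constraints (nearly) satisfied
            return x
        active.append(param_grid[worst])         # exchange step: add the violated index
    return x
```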
no code implementations • 12 Sep 2022 • Miao Lu, Wenhao Yang, Liangyu Zhang, Zhihua Zhang
Specifically, we propose a two-stage estimator based on instrumental variables and establish its statistical properties in confounded MDPs with a linear structure.
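For readers unfamiliar with two-stage instrumental-variable estimation, the classical two-stage least squares (2SLS) recipe below conveys the idea; the paper's estimator for confounded linear MDPs is more specialized and is not reproduced here.

```python
import numpy as np

def two_stage_least_squares(Z, X, Y):
    """Textbook 2SLS with instruments Z, endogenous features X, outcomes Y."""
    # Stage 1: regress the (endogenous) features X on the instruments Z.
    first_stage, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ first_stage                      # fitted, "de-confounded" features

    # Stage 2: regress the outcome Y on the fitted features.
    beta, *_ = np.linalg.lstsq(X_hat, Y, rcond=None)
    return beta
```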
no code implementations • 9 May 2021 • Wenhao Yang, Liangyu Zhang, Zhihua Zhang
In this paper, we study the non-asymptotic and asymptotic performance of the optimal robust policy and value function of robust Markov Decision Processes (MDPs), where the optimal robust policy and value function are computed using only a generative model.
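A minimal robust value iteration sketch is given below, under the simplifying assumption that the uncertainty set of transition kernels is a finite collection built from generative-model samples; the paper's analysis covers more general uncertainty sets, so the names and setup here are illustrative only.

```python
import numpy as np

def robust_value_iteration(P_set, R, gamma, n_iters=500):
    """Robust value iteration over a finite uncertainty set of transition kernels.

    P_set : list of (S, A, S) candidate transition models (e.g., empirical models
            estimated from generative-model samples); R : (S, A) reward matrix.
    """
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(n_iters):
        # Worst-case expected next-state value over the uncertainty set, per (s, a).
        worst_next = np.min([P @ V for P in P_set], axis=0)   # shape (S, A)
        Q = R + gamma * worst_next
        V = Q.max(axis=1)                                     # robust optimal value
    policy = Q.argmax(axis=1)                                 # greedy robust policy
    return V, policy
```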
no code implementations • 1 Jan 2021 • Jiadong Liang, Liangyu Zhang, Cheng Zhang, Zhihua Zhang
In this paper, we propose a novel approach for stabilizing the training process of Generative Adversarial Networks as well as alleviating the mode collapse problem.