Search Results for author: Zhipeng Liang

Found 9 papers, 4 papers with code

Reweighted Mixup for Subpopulation Shift

no code implementations9 Apr 2023 Zongbo Han, Zhipeng Liang, Fan Yang, Liu Liu, Lanqing Li, Yatao Bian, Peilin Zhao, QinGhua Hu, Bingzhe Wu, Changqing Zhang, Jianhua Yao

Subpopulation shift exists widely in many real-world applications, which refers to the training and test distributions that contain the same subpopulation groups but with different subpopulation proportions.

Fairness Generalization Bounds

Single-Trajectory Distributionally Robust Reinforcement Learning

no code implementations27 Jan 2023 Zhipeng Liang, Xiaoteng Ma, Jose Blanchet, Jiheng Zhang, Zhengyuan Zhou

As a framework for sequential decision-making, Reinforcement Learning (RL) has been regarded as an essential component leading to Artificial General Intelligence (AGI).

Decision Making Q-Learning +2

Vertical Federated Linear Contextual Bandits

no code implementations20 Oct 2022 Zeyu Cao, Zhipeng Liang, Shu Zhang, Hangyu Li, Ouyang Wen, Yu Rong, Peilin Zhao, Bingzhe Wu

In this paper, we investigate a novel problem of building contextual bandits in the vertical federated setting, i. e., contextual information is vertically distributed over different departments.

Multi-Armed Bandits

UMIX: Improving Importance Weighting for Subpopulation Shift via Uncertainty-Aware Mixup

1 code implementation19 Sep 2022 Zongbo Han, Zhipeng Liang, Fan Yang, Liu Liu, Lanqing Li, Yatao Bian, Peilin Zhao, Bingzhe Wu, Changqing Zhang, Jianhua Yao

Importance reweighting is a normal way to handle the subpopulation shift issue by imposing constant or adaptive sampling weights on each sample in the training dataset.

Generalization Bounds

Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation

no code implementations14 Sep 2022 Xiaoteng Ma, Zhipeng Liang, Jose Blanchet, Mingwen Liu, Li Xia, Jiheng Zhang, Qianchuan Zhao, Zhengyuan Zhou

Among the reasons hindering reinforcement learning (RL) applications to real-world problems, two factors are critical: limited data and the mismatch between the testing environment (real environment in which the policy is deployed) and the training environment (e. g., a simulator).

Offline RL reinforcement-learning +1

On Private Online Convex Optimization: Optimal Algorithms in $\ell_p$-Geometry and High Dimensional Contextual Bandits

1 code implementation16 Jun 2022 Yuxuan Han, Zhicong Liang, Zhipeng Liang, Yang Wang, Yuan YAO, Jiheng Zhang

To address such a challenge as the online convex optimization with privacy protection, we propose a private variant of online Frank-Wolfe algorithm with recursive gradients for variance reduction to update and reveal the parameters upon each data.

Multi-Armed Bandits

Generalized Linear Bandits with Local Differential Privacy

1 code implementation NeurIPS 2021 Yuxuan Han, Zhipeng Liang, Yang Wang, Jiheng Zhang

In this paper, we design LDP algorithms for stochastic generalized linear bandits to achieve the same regret bound as in non-privacy settings.

Decision Making Multi-Armed Bandits

Adversarial Deep Reinforcement Learning in Portfolio Management

3 code implementations29 Aug 2018 Zhipeng Liang, Hao Chen, Junhao Zhu, Kangkang Jiang, Yan-ran Li

In this paper, we implement three state-of-art continuous reinforcement learning algorithms, Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO) and Policy Gradient (PG)in portfolio management.

Management reinforcement-learning +1

Cannot find the paper you are looking for? You can Submit a new open access paper.