Search Results for author: Zhipeng Liang

Found 9 papers, 4 papers with code

Reweighted Mixup for Subpopulation Shift

no code implementations • 9 Apr 2023 • Zongbo Han, Zhipeng Liang, Fan Yang, Liu Liu, Lanqing Li, Yatao Bian, Peilin Zhao, Qinghua Hu, Bingzhe Wu, Changqing Zhang, Jianhua Yao

Subpopulation shift, which is widespread in real-world applications, refers to the setting where the training and test distributions contain the same subpopulation groups but in different proportions.

Fairness Generalization Bounds

Single-Trajectory Distributionally Robust Reinforcement Learning

no code implementations • 27 Jan 2023 • Zhipeng Liang, Xiaoteng Ma, Jose Blanchet, Jiheng Zhang, Zhengyuan Zhou

As a framework for sequential decision-making, Reinforcement Learning (RL) has been regarded as an essential component leading to Artificial General Intelligence (AGI).

Decision Making Q-Learning +2

Vertical Federated Linear Contextual Bandits

no code implementations • 20 Oct 2022 • Zeyu Cao, Zhipeng Liang, Shu Zhang, Hangyu Li, Ouyang Wen, Yu Rong, Peilin Zhao, Bingzhe Wu

In this paper, we investigate a novel problem of building contextual bandits in the vertical federated setting, i.e., contextual information is vertically distributed over different departments.

Multi-Armed Bandits

UMIX: Improving Importance Weighting for Subpopulation Shift via Uncertainty-Aware Mixup

1 code implementation • 19 Sep 2022 • Zongbo Han, Zhipeng Liang, Fan Yang, Liu Liu, Lanqing Li, Yatao Bian, Peilin Zhao, Bingzhe Wu, Changqing Zhang, Jianhua Yao

Importance reweighting is a common way to handle subpopulation shift by imposing constant or adaptive sampling weights on each sample in the training dataset.

Generalization Bounds

Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation

no code implementations • 14 Sep 2022 • Xiaoteng Ma, Zhipeng Liang, Jose Blanchet, Mingwen Liu, Li Xia, Jiheng Zhang, Qianchuan Zhao, Zhengyuan Zhou

Among the reasons hindering reinforcement learning (RL) applications to real-world problems, two factors are critical: limited data and the mismatch between the testing environment (the real environment in which the policy is deployed) and the training environment (e.g., a simulator).

Offline RL reinforcement-learning +1

On Private Online Convex Optimization: Optimal Algorithms in $\ell_p$-Geometry and High Dimensional Contextual Bandits

1 code implementation • 16 Jun 2022 • Yuxuan Han, Zhicong Liang, Zhipeng Liang, Yang Wang, Yuan Yao, Jiheng Zhang

To address the challenge of online convex optimization with privacy protection, we propose a private variant of the online Frank-Wolfe algorithm with recursive gradients for variance reduction, which updates and reveals the parameters upon each data point.

Multi-Armed Bandits

DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup

no code implementations • 16 Apr 2022 • Bingzhe Wu, Zhipeng Liang, Yuxuan Han, Yatao Bian, Peilin Zhao, Junzhou Huang

In this paper, we propose a general framework to solve the above two challenges simultaneously.

Drug Discovery Federated Learning +1

Generalized Linear Bandits with Local Differential Privacy

1 code implementation • NeurIPS 2021 • Yuxuan Han, Zhipeng Liang, Yang Wang, Jiheng Zhang

In this paper, we design LDP algorithms for stochastic generalized linear bandits to achieve the same regret bound as in non-privacy settings.

Decision Making Multi-Armed Bandits

Adversarial Deep Reinforcement Learning in Portfolio Management

3 code implementations • 29 Aug 2018 • Zhipeng Liang, Hao Chen, Junhao Zhu, Kangkang Jiang, Yan-ran Li

In this paper, we implement three state-of-the-art continuous reinforcement learning algorithms, Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO) and Policy Gradient (PG), in portfolio management.

Management reinforcement-learning +1
