Offline RL

234 papers with code • 2 benchmarks • 7 datasets

Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees

yifeizhou02/hnpg 14 Nov 2023

In this work, we propose a new hybrid RL algorithm that combines an on-policy actor-critic method with offline data.

1

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

srzer/LaMo-2023 31 Oct 2023

Offline reinforcement learning (RL) aims to find a near-optimal policy using pre-collected datasets.

33

Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning

zhaoyizhou1123/mbrcsl 30 Oct 2023

Off-policy dynamic programming (DP) techniques such as $Q$-learning have proven to be important in sequential decision-making problems.

8

Robust Offline Reinforcement learning with Heavy-Tailed Rewards

mamba413/room 28 Oct 2023

This paper endeavors to augment the robustness of offline reinforcement learning (RL) in scenarios laden with heavy-tailed rewards, a prevalent circumstance in real-world applications.

4

Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage

zaiyan-x/drqi 27 Oct 2023

The goal of an offline reinforcement learning (RL) algorithm is to learn optimal policies using historical (offline) data, without access to the environment for online exploration.

1

CROP: Conservative Reward for Model-based Offline Policy Optimization

g0k0ururi/crop 26 Oct 2023

Offline reinforcement learning (RL) aims to optimize a policy using collected data without online interactions.

5

Corruption-Robust Offline Reinforcement Learning with General Function Approximation

yangrui2015/uwmsg NeurIPS 2023

Notably, under the assumption of single policy coverage and the knowledge of $\zeta$, our proposed algorithm achieves a suboptimality bound that is worsened by an additive factor of $\mathcal{O}(\zeta (C(\widehat{\mathcal{F}},\mu)n)^{-1})$ due to the corruption.

2
23 Oct 2023

Towards Robust Offline Reinforcement Learning under Diverse Data Corruption

zzmtsvv/ORL 19 Oct 2023

Offline reinforcement learning (RL) presents a promising approach for learning reinforced policies from offline datasets without the need for costly or unsafe interactions with the environment.

38

Building Persona Consistent Dialogue Agents with Offline Reinforcement Learning

ryanshea10/personachat_offline_rl 16 Oct 2023

Our automatic and human evaluations show that our framework improves both the persona consistency and dialogue quality of a state-of-the-art social chatbot.

3

Offline Retraining for Online RL: Decoupled Policy Learning to Mitigate Exploration Bias

MaxSobolMark/OOO 12 Oct 2023

Can we leverage offline RL to recover better policies from online interaction?

15