Offline RL

227 papers with code • 2 benchmarks • 6 datasets

Offline reinforcement learning (RL) studies learning effective policies from fixed, previously collected datasets, without further interaction with the environment.

Libraries

Use these libraries to find Offline RL models and implementations
See all 10 libraries.

Most implemented papers

Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief

huawei-noah/HEBO 13 Oct 2022

To make the approach practical, we further devise an offline RL algorithm that approximately finds the solution.
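
A minimal sketch of the pessimism idea, assuming an ensemble of learned dynamics models: form a Bellman target under each model and keep a low-order statistic. The `models`/`q_value` interfaces and the k-th-smallest rule are illustrative assumptions, not the paper's exact belief update.

```python
import torch

def pessimistic_target(models, q_value, state, action, gamma=0.99, k=2):
    """Pessimistic Bellman target: evaluate (reward + gamma * V) under each
    dynamics model in the ensemble and keep the k-th smallest value."""
    targets = []
    for model in models:                      # each model maps (s, a) -> (r, s')
        reward, next_state = model(state, action)
        targets.append(reward + gamma * q_value(next_state))
    targets = torch.stack(targets)            # [ensemble, batch]
    return targets.kthvalue(k, dim=0).values  # smaller k = more pessimistic
```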

Extreme Q-Learning: MaxEnt RL without Entropy

div99/xql 5 Jan 2023

Using EVT, we derive our Extreme Q-Learning framework and consequently online and, for the first time, offline MaxEnt Q-learning algorithms that do not explicitly require access to a policy or its entropy.
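
The MaxEnt objective can be fit with a Gumbel/linex-style regression loss; a minimal sketch follows, whose minimizer over `v` is a soft maximum (LogSumExp) of `q`, so no policy or entropy term is needed. The clamp and names are stability assumptions, not the repo's exact code.

```python
import torch

def gumbel_regression_loss(q, v, beta=1.0, clip=7.0):
    """Linex/Gumbel regression: exp(z) - z - 1 with z = (q - v) / beta.
    The minimizer over v is beta * logsumexp(q / beta), a soft maximum,
    so the MaxEnt value is fit without sampling from a policy."""
    z = torch.clamp((q - v) / beta, max=clip)  # clamp: exp(z) blows up for large z
    return (torch.exp(z) - z - 1).mean()
```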

Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization

ryanxhr/ivr 28 Mar 2023

This gives a deeper understanding of why the in-sample learning paradigm works, i.e., it applies implicit value regularization to the policy.
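
In-sample methods fit values using only actions that appear in the dataset, so no out-of-distribution (OOD) actions are ever queried. A common instance of this family is expectile regression (as in IQL, which the paper analyzes as a special case); a minimal sketch, with `tau` chosen illustratively:

```python
import torch

def expectile_loss(q, v, tau=0.7):
    """Asymmetric L2 (expectile) regression of V toward in-sample Q values.
    tau > 0.5 upweights positive errors, pushing V toward an upper expectile
    of Q over dataset actions only -- no OOD actions are evaluated."""
    diff = q - v
    weight = torch.abs(tau - (diff < 0).float())  # tau if diff >= 0 else 1 - tau
    return (weight * diff.pow(2)).mean()
```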

MOReL: Model-Based Offline Reinforcement Learning

SwapnilPande/MOReL 12 May 2020

In this work, we present MOReL, an algorithmic framework for model-based offline RL.
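
MOReL builds a pessimistic MDP by flagging state-action pairs where an ensemble of learned dynamics models disagrees and penalizing them. A hedged sketch, using ensemble spread as the disagreement proxy; the paper's detector and constants differ.

```python
import torch

def pessimistic_reward(models, state, action, reward, threshold=0.1, penalty=100.0):
    """Pessimistic-MDP sketch: where ensemble next-state predictions disagree
    beyond a threshold, treat (s, a) as unknown and substitute a large
    negative reward. Threshold and penalty values are illustrative."""
    preds = torch.stack([m(state, action) for m in models])  # [ensemble, batch, dim]
    disagreement = preds.std(dim=0).max(dim=-1).values       # [batch]
    unknown = disagreement > threshold
    return torch.where(unknown, torch.full_like(reward, -penalty), reward)
```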

RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning

deepmind/deepmind-research 24 Jun 2020

We hope that our suite of benchmarks will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community.
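
The datasets are published through TensorFlow Datasets in an episodic (RLDS-style) format; a hedged loading sketch follows, where the exact dataset/config name is an assumption to be checked against the TFDS catalog.

```python
import tensorflow_datasets as tfds

# Assumed dataset name; consult the TFDS catalog for the exact RL Unplugged entry.
ds = tfds.load('rlu_control_suite/cartpole_swingup', split='train')
for episode in ds.take(1):
    for step in episode['steps'].take(3):    # each episode holds a nested dataset of steps
        print(step['reward'])
```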

Offline Meta-Reinforcement Learning with Advantage Weighting

eric-mitchell/macaw 13 Aug 2020

That is, in offline meta-RL, we meta-train on fixed, pre-collected data from several tasks in order to adapt to a new task with a very small amount of data (fewer than 5 trajectories) from the new task.
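
Advantage weighting refers to regressing onto dataset actions with weights that grow with the estimated advantage; a minimal sketch in that style (temperature and clipping are assumed, and MACAW's full meta-learning objective adds further terms):

```python
import torch

def advantage_weighted_nll(log_prob, advantage, temperature=1.0, max_weight=20.0):
    """Advantage-weighted regression: clone dataset actions, weighted by
    exp(A / temperature), keeping the policy on the data's support."""
    weight = torch.exp(advantage / temperature).clamp(max=max_weight)
    return -(weight.detach() * log_prob).mean()
```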

DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs

maximecb/gym-miniworld ICLR 2021

We study an approach to offline reinforcement learning (RL) based on optimally solving finitely-represented MDPs derived from a static dataset of experience.
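
Once a finite MDP has been derived from the dataset, it can be solved exactly with dynamic programming; a minimal sketch of that final step (the paper's non-parametric construction of P and R from experience is the substantive part and is not reproduced here):

```python
import numpy as np

def value_iteration(P, R, gamma=0.99, iters=500):
    """Solve a finite MDP with transitions P [S, A, S] and rewards R [S, A]."""
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        V = (R + gamma * P @ V).max(axis=1)   # Bellman optimality backup
    Q = R + gamma * P @ V
    return Q.argmax(axis=1)                   # greedy policy for the derived MDP
```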

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

google-research/google-research 14 Mar 2021

Many modern approaches to offline Reinforcement Learning (RL) utilize behavior regularization, typically augmenting a model-free actor critic algorithm with a penalty measuring divergence of the policy from the offline data.
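
The generic behavior-regularized actor objective trades off value maximization against divergence from the data; a sketch with a sampled KL surrogate follows (Fisher-BRC's actual contribution is a Fisher-divergence critic regularizer, which this sketch does not reproduce):

```python
import torch

def behavior_regularized_actor_loss(q, log_pi, log_behavior, alpha=1.0):
    """Maximize Q while penalizing divergence from the behavior policy.
    With actions sampled from the current policy, (log_pi - log_behavior)
    is a one-sample KL estimate; alpha sets the regularization strength."""
    return (-q + alpha * (log_pi - log_behavior)).mean()
```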

Online and Offline Reinforcement Learning by Planning with a Learned Model

DHDev0/Muzero-unplugged NeurIPS 2021

Combining Reanalyse with the MuZero algorithm, we introduce MuZero Unplugged, a single unified algorithm for any data budget, including offline RL.
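
Reanalyse repeatedly re-runs search over stored trajectories with the latest network to produce fresh policy and value targets; a schematic sketch with entirely hypothetical names:

```python
def reanalyse(buffer, network, mcts):
    """Schematic Reanalyse loop (buffer/network/mcts interfaces assumed):
    refresh training targets for saved trajectories using current search."""
    while True:
        trajectory = buffer.sample()
        for step in trajectory:
            root = mcts.run(network, step.observation)   # search with latest weights
            step.policy_target = root.visit_distribution()
            step.value_target = root.value()
        yield trajectory   # consumed by the standard MuZero update
```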

Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning

apple/ml-uwac 17 May 2021

Offline Reinforcement Learning promises to learn effective policies from previously-collected, static datasets without the need for exploration.
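
UWAC's central idea is to down-weight Bellman backups whose targets are uncertain (estimated, e.g., via Monte Carlo dropout), so out-of-distribution samples contribute less; a sketch with an assumed weighting form:

```python
import torch

def uncertainty_weighted_td_loss(q, target, target_variance, beta=1.0):
    """Weight squared TD errors by an inverse-variance factor in (0, 1]
    so uncertain (likely OOD) targets are down-weighted. The exact
    weighting in the paper may differ; this form is an assumption."""
    weight = beta / (target_variance + beta)
    return (weight.detach() * (q - target).pow(2)).mean()
```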