Offline RL

224 papers with code • 2 benchmarks • 6 datasets

Latest papers with no code

Why Online Reinforcement Learning is Causal

no code yet • 7 Mar 2024

Our main argument is that in online learning, conditional probabilities are causal, and therefore offline RL is the setting where causal learning has the most potential to make a difference.

Offline Fictitious Self-Play for Competitive Games

no code yet • 29 Feb 2024

First, without knowledge of the game structure, it is impossible to interact with opponents and conduct self-play, a major learning paradigm for competitive games.

Trajectory-wise Iterative Reinforcement Learning Framework for Auto-bidding

no code yet • 23 Feb 2024

The trained policy can subsequently be deployed for further data collection, resulting in an iterative training framework, which we refer to as iterative offline RL.

Align Your Intents: Offline Imitation Learning via Optimal Transport

no code yet • 20 Feb 2024

We report that AILOT outperforms state-of-the-art offline imitation learning algorithms on D4RL benchmarks and improves the performance of other offline RL algorithms in sparse-reward tasks.

Offline Multi-task Transfer RL with Representational Penalization

no code yet • 19 Feb 2024

We study the problem of representation transfer in offline Reinforcement Learning (RL), where a learner has access to episodic data from a number of source tasks collected a priori, and aims to learn a shared representation to be used in finding a good policy for a target task.

Goal-Conditioned Offline Reinforcement Learning via Metric Learning

no code yet • 16 Feb 2024

Experimentally, we show that our method consistently outperforms other offline RL baselines when learning from sub-optimal offline datasets.

Reward Poisoning Attack Against Offline Reinforcement Learning

no code yet • 15 Feb 2024

To the best of our knowledge, we propose the first black-box reward poisoning attack in the general offline RL setting.

Measurement Scheduling for ICU Patients with Offline Reinforcement Learning

no code yet • 12 Feb 2024

Scheduling laboratory tests for ICU patients presents a significant challenge.

More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning

no code yet • 11 Feb 2024

Second-order bounds are instance-dependent bounds that scale with the variance of return, which we prove are tighter than the previously known small-loss bounds of distributional RL.

Offline Actor-Critic Reinforcement Learning Scales to Large Models

no code yet • 8 Feb 2024

We show that offline actor-critic reinforcement learning can scale to large models, such as transformers, and follows scaling laws similar to those of supervised learning.