Offline RL
225 papers with code • 2 benchmarks • 6 datasets
Latest papers
DiffClone: Enhanced Behaviour Cloning in Robotics with Diffusion-Driven Policy Learning
The Train-Offline-Test-Online (TOTO) Benchmark provides a well-curated open-source dataset for offline training, composed mostly of expert data, together with benchmark scores for common offline-RL and behaviour-cloning agents.
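Behaviour cloning, the baseline family this benchmark scores, reduces policy learning to supervised regression on expert state-action pairs. A minimal sketch with illustrative dimensions and synthetic data (nothing here is taken from the TOTO benchmark), fitting a linear policy in closed form:

```python
import numpy as np

# Hypothetical offline expert data: 256 states of dim 4, actions of dim 2.
rng = np.random.default_rng(0)
states = rng.normal(size=(256, 4))
true_W = rng.normal(size=(4, 2))
actions = states @ true_W            # expert actions (noise-free for clarity)

# Behaviour cloning as least squares: W = argmin ||S W - A||^2.
W, *_ = np.linalg.lstsq(states, actions, rcond=None)

mse = float(np.mean((states @ W - actions) ** 2))
```

Real BC agents swap the linear map for a neural network and minimise the same imitation loss by gradient descent; the supervised structure is unchanged.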
Learning from Sparse Offline Datasets via Conservative Density Estimation
Offline reinforcement learning (RL) offers a promising direction for learning policies from pre-collected datasets without requiring further interactions with the environment.
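The defining constraint of offline RL is visible even in a toy setting: the agent replays a fixed batch of logged transitions and never queries the environment. A tabular sketch on a hypothetical 2-state, 2-action MDP (all names and numbers are illustrative):

```python
import numpy as np

# Fixed offline batch of logged transitions: (state, action, reward, next_state).
transitions = [
    (0, 0, 0.0, 1), (0, 1, 1.0, 0),
    (1, 0, 0.0, 0), (1, 1, 2.0, 1),
]

Q = np.zeros((2, 2))
gamma, alpha = 0.9, 0.5
for _ in range(200):                     # sweep the batch repeatedly;
    for s, a, r, s2 in transitions:      # no environment interaction occurs
        target = r + gamma * Q[s2].max()
        Q[s, a] += alpha * (target - Q[s, a])

policy = Q.argmax(axis=1)                # greedy policy from the learned Q
```

Because this tiny batch happens to cover every state-action pair, plain Q-learning converges; the papers below address what goes wrong when the dataset leaves parts of the space out of distribution.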
SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning
Alleviating overestimation bias is a critical challenge for deep reinforcement learning to achieve successful performance on more complex tasks or offline datasets containing out-of-distribution data.
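A standard remedy for overestimation bias, and the starting point for Q-ensemble methods like the one above, is to back up a pessimistic statistic of an ensemble rather than a single (or maximal) estimate. A self-contained illustration with synthetic noise, not SPQR's actual mechanism:

```python
import numpy as np

# Each of 5 ensemble members estimates the same true Q-value of 1.0,
# corrupted by independent Gaussian noise, over 10,000 trials.
rng = np.random.default_rng(1)
true_q = 1.0
ensemble = true_q + rng.normal(scale=0.5, size=(5, 10_000))

naive = np.max(ensemble, axis=0).mean()        # max over noisy estimates
pessimistic = np.min(ensemble, axis=0).mean()  # clipped (ensemble-min) target
```

The max systematically overestimates the true value while the ensemble-min underestimates it, which is exactly the bias that pessimistic backups exploit; how correlated the ensemble members are (the independence SPQR controls) determines how strong that correction is.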
Policy-regularized Offline Multi-objective Reinforcement Learning
In this paper, we aim to utilize only offline trajectory data to train a policy for multi-objective RL.
Online Symbolic Music Alignment with Offline Reinforcement Learning
The approach is evaluated in three roles: first, in its capacity to identify correct score positions for sampled test contexts; second, as the core technique of a complete algorithm for symbolic online note-wise alignment; and finally, as a real-time symbolic score follower.
PDiT: Interleaving Perception and Decision-making Transformers for Deep Reinforcement Learning
Designing better deep networks and better reinforcement learning (RL) algorithms are both important for deep RL.
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
In this paper, we systematically study the remaining challenges in offline-to-online (O2O) RL from a novel perspective, and identify that the slow performance improvement and instability of online finetuning stem from inaccurate Q-value estimation inherited from offline pretraining.
Traffic Signal Control Using Lightweight Transformers: An Offline-to-Online RL Approach
In this work, we propose DTLight, a simple yet powerful lightweight Decision Transformer-based TSC method that can learn a policy from easily accessible offline datasets.
The Generalization Gap in Offline Reinforcement Learning
Our experiments reveal that existing offline learning algorithms struggle to match the performance of online RL on both train and test environments.
MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator
This method trades off performance and robustness by introducing the robust Bellman operator into the algorithm.
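The intuition behind conservative Bellman backups can be sketched in a few lines. This is a hedged illustration of the general idea, not MICRO's exact operator: subtract a penalty from the target for state-action pairs the dataset covers poorly, so the learned Q stays pessimistic out of distribution. The `coverage` and `beta` names are hypothetical.

```python
def conservative_target(r, q_next, coverage, gamma=0.99, beta=1.0):
    """Standard Bellman target minus a penalty that grows as dataset
    coverage of (s, a) shrinks; coverage in [0, 1], beta scales pessimism."""
    return r + gamma * q_next - beta * (1.0 - coverage)

in_dist = conservative_target(1.0, 2.0, coverage=1.0)  # well-covered pair
ood = conservative_target(1.0, 2.0, coverage=0.1)      # rarely-seen pair
```

The same reward and next-state value yield a strictly lower target out of distribution, which is how conservative operators discourage the policy from exploiting unreliable Q-values.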