Offline RL
225 papers with code • 2 benchmarks • 6 datasets
Latest papers
DiffClone: Enhanced Behaviour Cloning in Robotics with Diffusion-Driven Policy Learning
The Train-Offline-Test-Online (TOTO) Benchmark provides a well-curated open-source dataset for offline training, composed mostly of expert data, together with benchmark scores for common offline-RL and behaviour-cloning agents.
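Behaviour cloning, the baseline family this benchmark scores, reduces policy learning to supervised regression on expert state-action pairs. A minimal sketch with illustrative dimensions and synthetic data (nothing here is taken from the TOTO benchmark), fitting a linear policy in closed form:

```python
import numpy as np

# Hypothetical offline expert data: 256 states of dim 4, actions of dim 2.
rng = np.random.default_rng(0)
states = rng.normal(size=(256, 4))
true_W = rng.normal(size=(4, 2))
actions = states @ true_W            # expert actions (noise-free for clarity)

# Behaviour cloning as least squares: W = argmin ||S W - A||^2.
W, *_ = np.linalg.lstsq(states, actions, rcond=None)

mse = float(np.mean((states @ W - actions) ** 2))
```

Real BC agents swap the linear map for a neural network and minimise the same imitation loss by gradient descent; the supervised structure is unchanged.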
Learning from Sparse Offline Datasets via Conservative Density Estimation
Offline reinforcement learning (RL) offers a promising direction for learning policies from pre-collected datasets without requiring further interactions with the environment.
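The defining constraint of offline RL is visible even in a toy setting: the agent replays a fixed batch of logged transitions and never queries the environment. A tabular sketch on a hypothetical 2-state, 2-action MDP (all names and numbers are illustrative):

```python
import numpy as np

# Fixed offline batch of logged transitions: (state, action, reward, next_state).
transitions = [
    (0, 0, 0.0, 1), (0, 1, 1.0, 0),
    (1, 0, 0.0, 0), (1, 1, 2.0, 1),
]

Q = np.zeros((2, 2))
gamma, alpha = 0.9, 0.5
for _ in range(200):                     # sweep the batch repeatedly;
    for s, a, r, s2 in transitions:      # no environment interaction occurs
        target = r + gamma * Q[s2].max()
        Q[s, a] += alpha * (target - Q[s, a])

policy = Q.argmax(axis=1)                # greedy policy from the learned Q
```

Because this tiny batch happens to cover every state-action pair, plain Q-learning converges; the papers below address what goes wrong when the dataset leaves parts of the space out of distribution.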
SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning
Alleviating overestimation bias is a critical challenge for deep reinforcement learning to achieve successful performance on more complex tasks or offline datasets containing out-of-distribution data.
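A standard remedy for overestimation bias, and the starting point for Q-ensemble methods like the one above, is to back up a pessimistic statistic of an ensemble rather than a single (or maximal) estimate. A self-contained illustration with synthetic noise, not SPQR's actual mechanism:

```python
import numpy as np

# Each of 5 ensemble members estimates the same true Q-value of 1.0,
# corrupted by independent Gaussian noise, over 10,000 trials.
rng = np.random.default_rng(1)
true_q = 1.0
ensemble = true_q + rng.normal(scale=0.5, size=(5, 10_000))

naive = np.max(ensemble, axis=0).mean()        # max over noisy estimates
pessimistic = np.min(ensemble, axis=0).mean()  # clipped (ensemble-min) target
```

The max systematically overestimates the true value while the ensemble-min underestimates it, which is exactly the bias that pessimistic backups exploit; how correlated the ensemble members are (the independence SPQR controls) determines how strong that correction is.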
Policy-regularized Offline Multi-objective Reinforcement Learning
In this paper, we aim to utilize only offline trajectory data to train a policy for multi-objective RL.
Online Symbolic Music Alignment with Offline Reinforcement Learning
The approach is evaluated in three roles: first, in its capacity to identify correct score positions for sampled test contexts; second, as the core technique of a complete algorithm for symbolic online note-wise alignment; and finally, as a real-time symbolic score follower.
PDiT: Interleaving Perception and Decision-making Transformers for Deep Reinforcement Learning
Designing better deep networks and better reinforcement learning (RL) algorithms are both important for deep RL.
A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning
In this paper, we systematically study the remaining challenges in offline-to-online (O2O) RL from a novel perspective, and identify that the slow performance improvement and instability of online finetuning stem from inaccurate Q-value estimation inherited from offline pretraining.
Traffic Signal Control Using Lightweight Transformers: An Offline-to-Online RL Approach
In this work, we propose DTLight, a simple yet powerful lightweight Decision Transformer-based TSC method that can learn a policy from easily accessible offline datasets.
The Generalization Gap in Offline Reinforcement Learning
Our experiments reveal that existing offline learning algorithms struggle to match the performance of online RL on both train and test environments.
MICRO: Model-Based Offline Reinforcement Learning with a Conservative Bellman Operator
This method trades off performance and robustness by introducing the robust Bellman operator into the algorithm.
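The intuition behind conservative Bellman backups can be sketched in a few lines. This is a hedged illustration of the general idea, not MICRO's exact operator: subtract a penalty from the target for state-action pairs the dataset covers poorly, so the learned Q stays pessimistic out of distribution. The `coverage` and `beta` names are hypothetical.

```python
def conservative_target(r, q_next, coverage, gamma=0.99, beta=1.0):
    """Standard Bellman target minus a penalty that grows as dataset
    coverage of (s, a) shrinks; coverage in [0, 1], beta scales pessimism."""
    return r + gamma * q_next - beta * (1.0 - coverage)

in_dist = conservative_target(1.0, 2.0, coverage=1.0)  # well-covered pair
ood = conservative_target(1.0, 2.0, coverage=0.1)      # rarely-seen pair
```

The same reward and next-state value yield a strictly lower target out of distribution, which is how conservative operators discourage the policy from exploiting unreliable Q-values.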