Offline RL
226 papers with code • 2 benchmarks • 6 datasets
Libraries
Use these libraries to find Offline RL models and implementations.
Latest papers with no code
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
Second-order bounds are instance-dependent bounds that scale with the variance of return, which we prove are tighter than the previously known small-loss bounds of distributional RL.
Offline Actor-Critic Reinforcement Learning Scales to Large Models
We show that offline actor-critic reinforcement learning can scale to large models, such as transformers, and follows scaling laws similar to those of supervised learning.
Federated Offline Reinforcement Learning: Collaborative Single-Policy Coverage Suffices
Our sample complexity analysis reveals that, with appropriately chosen parameters and synchronization schedules, FedLCB-Q achieves linear speedup in terms of the number of agents without requiring high-quality datasets at individual agents, as long as the local datasets collectively cover the state-action space visited by the optimal policy, highlighting the power of collaboration in the federated setting.
Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning
Recent advances in real-world applications of reinforcement learning (RL) have relied on the ability to accurately simulate systems at scale.
A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Low-Rank MDPs
Our algorithm is the first computationally efficient algorithm in this setting that achieves sample complexity of $O(\epsilon^{-2})$ under a partial data coverage assumption.
DiffStitch: Boosting Offline Reinforcement Learning with Diffusion-based Trajectory Stitching
In offline reinforcement learning (RL), the performance of the learned policy highly depends on the quality of offline datasets.
The Virtues of Pessimism in Inverse Reinforcement Learning
Inverse Reinforcement Learning (IRL) is a powerful framework for learning complex behaviors from expert demonstrations.
Towards an Information Theoretic Framework of Context-Based Offline Meta-Reinforcement Learning
As a marriage between offline RL and meta-RL, offline meta-reinforcement learning (OMRL) has shown great promise in enabling RL agents to multi-task and adapt quickly while acquiring knowledge safely.
Value-Aided Conditional Supervised Learning for Offline RL
Offline reinforcement learning (RL) has seen notable advancements through return-conditioned supervised learning (RCSL) and value-based methods, yet each approach comes with its own set of practical challenges.
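The return-conditioned supervised learning (RCSL) idea mentioned above reduces offline policy learning to regression: predict the logged action from the state and the return-to-go, then condition on a high target return at test time. A minimal sketch on synthetic data (all names, dynamics, and coefficients here are illustrative, not from the paper):

```python
import numpy as np

# Toy offline dataset: state s, return-to-go g, logged action a.
# The behavior data is synthetic; the "true" action depends on both
# the state and the return achieved afterwards.
rng = np.random.default_rng(0)
N = 1000
s = rng.normal(size=(N, 1))
g = rng.uniform(0.0, 1.0, size=(N, 1))                   # return-to-go labels
a = 0.5 * s + 2.0 * g + 0.05 * rng.normal(size=(N, 1))   # behavior actions

# RCSL: supervised regression of action on (state, return-to-go).
# A linear least-squares fit stands in for a large policy network.
X = np.hstack([s, g, np.ones((N, 1))])
w, *_ = np.linalg.lstsq(X, a, rcond=None)

def rcsl_policy(state, target_return):
    """Act by conditioning on a desired (typically high) return."""
    x = np.array([state, target_return, 1.0])
    return float(x @ w)

# Conditioning on a higher target return shifts the chosen action.
print(rcsl_policy(0.0, 0.9) > rcsl_policy(0.0, 0.1))  # → True
```

Value-based methods instead learn a Q-function and act greedily; the practical trade-off the paper alludes to is that RCSL inherits the simplicity of supervised training but depends on the quality of the return conditioning.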
Context-Former: Stitching via Latent Conditioned Sequence Modeling
On the other hand, Decision Transformer (DT) abstracts decision-making as sequence modeling and shows competitive performance on offline RL benchmarks; however, recent studies demonstrate that DT lacks stitching capability, so endowing DT with stitching capability is vital to further improving its performance.