About

Benchmarks

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Subtasks

Datasets

Latest papers with code

COMBO: Conservative Offline Model-Based Policy Optimization

16 Feb 2021 · Polixir/OfflineRL

We overcome this limitation by developing a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-action tuples generated via rollouts under the learned model.

OFFLINE RL

0 · 16 Feb 2021
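The core idea in the COMBO abstract, pushing value estimates down on model-generated (possibly out-of-support) tuples, can be illustrated with a minimal sketch. This is an illustrative toy, not the authors' implementation; the function name and sample values are assumptions.

```python
import numpy as np

def combo_regularizer(q_model_rollout, q_dataset):
    """Both arguments: 1-D arrays of Q-values for sampled tuples.

    Positive when the critic values model-generated (possibly
    out-of-support) tuples above dataset tuples, i.e. when it is
    likely overestimating out-of-support values.
    """
    return float(q_model_rollout.mean() - q_dataset.mean())

# Toy numbers: the critic rates model rollouts higher than data tuples,
# so the regularizer adds a positive penalty to the critic loss.
penalty = combo_regularizer(np.array([2.0, 4.0]), np.array([1.0, 1.0]))
```

Minimizing the critic loss plus this penalty pushes model-rollout Q-values down and dataset Q-values up, which is the conservatism the abstract describes.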

NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning

1 Feb 2021 · polixir/NeoRL

We evaluate existing offline RL algorithms on NeoRL and argue that the performance of a policy should also be compared with the deterministic version of the behavior policy, instead of the dataset reward.

OFFLINE RL

9 · 01 Feb 2021
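The baseline the NeoRL abstract argues for, the deterministic version of the behavior policy, can be derived from a dataset in the simplest discrete case by taking the most frequent action per state. The toy dataset below is an assumption for illustration.

```python
from collections import Counter, defaultdict

# Toy offline dataset of (state, action) pairs collected by a
# stochastic behavior policy.
dataset = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1)]

# Deterministic version of the behavior policy: for each state, the
# action the behavior policy chose most often.
counts = defaultdict(Counter)
for s, a in dataset:
    counts[s][a] += 1
deterministic_behavior = {s: c.most_common(1)[0][0] for s, c in counts.items()}
```

Evaluating this policy in the environment gives a stronger reference point than the raw dataset return, since it strips out the behavior policy's exploration noise.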

POPO: Pessimistic Offline Policy Optimization

26 Dec 2020 · sweetice/POPO

Offline reinforcement learning (RL), also known as batch RL, aims to optimize policy from a large pre-recorded dataset without interaction with the environment.

OFFLINE RL Q-LEARNING

0 · 26 Dec 2020
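The title points at pessimism as the key ingredient. One common way to realize pessimistic value estimation, used here as an illustrative assumption rather than POPO's exact estimator, is a lower confidence bound over an ensemble of Q-estimates.

```python
import numpy as np

def pessimistic_value(q_ensemble, k=1.0):
    """q_ensemble: (n_models, batch) Q-estimates for the same inputs.

    Returns mean minus k standard deviations: uncertain (high-variance)
    estimates are penalized, discouraging the policy from exploiting
    out-of-distribution actions the fixed dataset cannot correct.
    """
    return q_ensemble.mean(axis=0) - k * q_ensemble.std(axis=0)

# Two ensemble members, two state-action inputs: the members disagree
# on the first input, so its pessimistic value is pulled down.
q = np.array([[1.0, 2.0],
              [3.0, 2.0]])
v = pessimistic_value(q)
```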

Batch Exploration with Examples for Scalable Robotic Reinforcement Learning

22 Oct 2020 · stanford-iris-lab/batch-exploration

Concretely, we propose an exploration technique, Batch Exploration with Examples (BEE), that explores relevant regions of the state-space, guided by a modest number of human provided images of important states.

OFFLINE RL

3 · 22 Oct 2020
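The guidance signal the BEE abstract describes, a modest number of human-provided examples of important states, can be sketched as a relevance score: candidate states are ranked by proximity to the nearest example. The 2-D embeddings below are toy stand-ins, not the paper's learned image representations.

```python
import numpy as np

# Human-provided examples of "important" states (toy embeddings).
examples = np.array([[1.0, 1.0], [0.9, 1.1]])
# Candidate states the agent could explore next.
candidates = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])

def relevance(state):
    # Higher score = closer to some human-provided example state.
    dists = np.linalg.norm(examples - state, axis=1)
    return -dists.min()

scores = np.array([relevance(s) for s in candidates])
best = candidates[scores.argmax()]   # explore near the relevant region
```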

DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs

ICLR 2021 · maximecb/gym-miniworld

We study an approach to offline reinforcement learning (RL) based on optimally solving finitely-represented MDPs derived from a static dataset of experience.

OFFLINE RL

331 · 18 Oct 2020
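The recipe in the DeepAveragers abstract, derive a finitely-represented MDP from the static dataset and solve it optimally, can be sketched in the tabular case: build empirical transition probabilities and average rewards from counts, then run value iteration. The toy dataset and discount are assumptions.

```python
import numpy as np

# Static dataset of (state, action, reward, next_state) tuples.
dataset = [(0, 0, 0.0, 1), (0, 1, 1.0, 0), (1, 0, 0.0, 0), (1, 1, 2.0, 1)]
nS, nA, gamma = 2, 2, 0.9

# Derive an empirical finite MDP from counts (each (s, a) appears once
# here, so the empirical transitions are deterministic).
P = np.zeros((nS, nA, nS))
R = np.zeros((nS, nA))
for s, a, r, s2 in dataset:
    P[s, a, s2] += 1.0
    R[s, a] += r
P /= P.sum(axis=2, keepdims=True)

# Solve the derived MDP exactly with value iteration.
V = np.zeros(nS)
for _ in range(200):
    V = (R + gamma * P @ V).max(axis=1)
```

The resulting policy is optimal for the derived MDP, which is the sense in which the approach solves the offline problem "optimally" relative to the data.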

Human-centric Dialog Training via Offline Reinforcement Learning

EMNLP 2020 · natashamjaques/neural_chat

We start by hosting models online, and gather human feedback from real-time, open-ended conversations, which we then use to train and improve the models using offline reinforcement learning (RL).

LANGUAGE MODELLING OFFLINE RL

136 · 12 Oct 2020

Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

ICLR 2021 · FOCAL-ICLR/FOCAL-ICLR

In this work, we enforce behavior regularization on learned policy as a general approach to offline RL, combined with a deterministic context encoder for efficient task inference.

META REINFORCEMENT LEARNING METRIC LEARNING OFFLINE RL

2 · 02 Oct 2020
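The second ingredient named in the FOCAL abstract, a deterministic context encoder for efficient task inference, can be sketched as follows: a batch of transitions is mapped to a single deterministic embedding and matched to the nearest task under a distance metric. The mean-pooling encoder and fixed centroids are illustrative assumptions, not the paper's learned components.

```python
import numpy as np

# Per-task reference embeddings (toy stand-ins for learned clusters).
task_centroids = np.array([[0.0, 0.0], [4.0, 4.0]])

def infer_task(context_embeddings):
    """context_embeddings: (n_transitions, dim) embedded transitions."""
    z = context_embeddings.mean(axis=0)              # deterministic encoding
    d = np.linalg.norm(task_centroids - z, axis=1)   # distance metric
    return int(d.argmin())                           # nearest task

# Transitions embedded near the second task's centroid.
ctx = np.array([[3.5, 4.2], [4.1, 3.9]])
task = infer_task(ctx)
```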

Conservative Q-Learning for Offline Reinforcement Learning

NeurIPS 2020 · aviralkumar2907/CQL

We theoretically show that CQL produces a lower bound on the value of the current policy and that it can be incorporated into a policy learning procedure with theoretical improvement guarantees.

CONTINUOUS CONTROL DQN REPLAY DATASET Q-LEARNING

94 · 08 Jun 2020
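The conservative term behind the lower-bound result can be sketched in the discrete-action case: a soft maximum over all actions is pushed down while Q-values on dataset actions are pushed up. This is a minimal NumPy illustration in the spirit of CQL, not the authors' training code; the sample Q-table is an assumption.

```python
import numpy as np

def cql_penalty(q_values, data_actions):
    """q_values: (batch, n_actions) array; data_actions: (batch,) ints.

    log-sum-exp over actions upper-bounds the max, so subtracting the
    dataset-action Q yields a non-negative penalty that is large when
    the critic overestimates actions absent from the data.
    """
    logsumexp = np.log(np.exp(q_values).sum(axis=1))
    data_q = q_values[np.arange(len(q_values)), data_actions]
    return float((logsumexp - data_q).mean())

# Toy batch of two states with three actions each.
q = np.array([[1.0, 2.0, 0.5],
              [0.0, 1.0, 3.0]])
acts = np.array([1, 2])          # actions actually taken in the dataset
penalty = cql_penalty(q, acts)
```

Adding this penalty to the usual Bellman-error loss biases the learned Q-function downward on out-of-distribution actions, which is what yields the lower-bound guarantee the abstract states.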

Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization

ICLR 2021 · matsuolab/BREMEN

We propose a novel model-based algorithm, Behavior-Regularized Model-ENsemble (BREMEN) that can effectively optimize a policy offline using 10-20 times fewer data than prior works.

OFFLINE RL

24 · 05 Jun 2020
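The name spells out the two ingredients: a dynamics-model ensemble for imagined rollouts and a behavior regularizer keeping the policy near the data-collecting policy. Both pieces can be sketched with toys; the linear models, the action distributions, and the KL form of the regularizer are illustrative assumptions, not BREMEN's exact components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ensemble of toy linear dynamics models (stand-ins for learned nets).
ensemble = [rng.normal(size=(2, 2)) * 0.1 + np.eye(2) for _ in range(3)]

def rollout_step(state):
    # Imagined transition: sample one ensemble member per step, the
    # usual way model ensembles inject epistemic uncertainty.
    model = ensemble[rng.integers(len(ensemble))]
    return model @ state

def behavior_kl(policy_probs, behavior_probs):
    # KL(pi || beta): grows as the learned policy drifts from the
    # behavior policy; adding it to the objective is the behavior
    # regularization referred to in the abstract.
    return float(np.sum(policy_probs * np.log(policy_probs / behavior_probs)))

pi = np.array([0.7, 0.3])        # learned policy over two actions
beta = np.array([0.5, 0.5])      # behavior policy estimated from data
```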

Acme: A Research Framework for Distributed Reinforcement Learning

1 Jun 2020 · deepmind/acme

Ultimately, we show that the design decisions behind Acme lead to agents that can be scaled both up and down and that, for the most part, greater levels of parallelization result in agents with equivalent performance, just faster.

DQN REPLAY DATASET

1,961 · 01 Jun 2020