Q-Learning

388 papers with code • 0 benchmarks • 2 datasets

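Q-learning is a model-free reinforcement learning algorithm whose goal is to learn a policy that tells an agent which action to take in each state. It does so by estimating the value Q(s, a) of taking action a in state s and then acting greedily with respect to those estimates.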
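
As a rough illustration of how such a policy can be learned, here is a minimal sketch of tabular Q-learning. The chain environment, the function names (`q_learning`, `greedy_policy`), and the hyperparameter values are all assumptions made for this example and are not taken from any of the papers listed on this page.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain environment.

    States are 0..n_states-1; actions are 0 (left) and 1 (right).
    Entering the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # Epsilon-greedy selection, breaking ties between equal values at random.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[s])
                a = rng.choice([x for x in (0, 1) if q[s][x] == best])
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == n_states - 1 else 0.0
            # Q-learning update: bootstrap from the best action in the next state.
            # The terminal row is never updated, so its value stays 0.
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


q_table = q_learning()
policy = greedy_policy(q_table)
```

In this toy setting the learned policy moves right in every non-terminal state, which is the shortest path to the reward. Deep Q-learning replaces the table `q` with a neural network but keeps the same bootstrapped target.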

(Image credit: Playing Atari with Deep Reinforcement Learning)

Benchmarks


These leaderboards are used to track progress in Q-Learning.

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Libraries

Use these libraries to find Q-Learning models and implementations.

opendilab/DI-engine • 6 papers • 2,548 stars

zzmtsvv/rl_task • 6 papers

hill-a/stable-baselines • 5 papers • 4,042 stars

toni-sm/skrl • 5 papers • 403 stars

See all 29 libraries.

Datasets

Most implemented papers


Continuous control with deep reinforcement learning

ray-project/ray • 9 Sep 2015

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain.

157


Playing Atari with Deep Reinforcement Learning

labmlai/annotated_deep_learning_paper_implementations • 19 Dec 2013

We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning.

111


Deep Reinforcement Learning with Double Q-learning

labmlai/annotated_deep_learning_paper_implementations • 22 Sep 2015

The popular Q-learning algorithm is known to overestimate action values under certain conditions.
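
One way to see where the overestimation comes from, and how the double Q-learning idea counters it, is to compare the two bootstrap targets side by side. This is a minimal sketch under stated assumptions: the function names and example values are hypothetical, and the two lists stand in for the next-state action values produced by an online network and a target network.

```python
def q_learning_target(reward, gamma, next_q_target):
    """Standard target: the same values both select and evaluate the action."""
    return reward + gamma * max(next_q_target)


def double_q_target(reward, gamma, next_q_online, next_q_target):
    """Double Q-learning target: the online values select the action,
    and the target values evaluate it."""
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical noisy estimates for three actions in the next state.
next_q_online = [1.0, 2.0, 0.5]
next_q_target = [1.5, 1.0, 3.0]

standard = q_learning_target(0.0, 0.9, next_q_target)
double = double_q_target(0.0, 0.9, next_q_online, next_q_target)
```

With these example values the standard target is about 2.7 and the double target about 0.9. Because the double target evaluates one particular action rather than the maximum, it can never exceed the standard target computed from the same target values.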


Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

openai/multiagent-particle-envs • NeurIPS 2017

We explore deep reinforcement learning methods for multi-agent domains.


Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

haarnoja/sac • ICML 2018

In this paper, we propose soft actor-critic, an off-policy actor-critic deep RL algorithm based on the maximum entropy reinforcement learning framework.


Addressing Function Approximation Error in Actor-Critic Methods

sfujim/TD3 • ICML 2018

In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies.


Evolution Strategies as a Scalable Alternative to Reinforcement Learning

ray-project/ray • 10 Mar 2017

We explore the use of Evolution Strategies (ES), a class of black box optimization algorithms, as an alternative to popular MDP-based RL techniques such as Q-learning and Policy Gradients.


Conservative Q-Learning for Offline Reinforcement Learning

aviralkumar2907/CQL • NeurIPS 2020

We theoretically show that CQL produces a lower bound on the value of the current policy and that it can be incorporated into a policy learning procedure with theoretical improvement guarantees.


A disembodied developmental robotic agent called Samu Bátfai

nbatfai/isaac • 9 Nov 2015

The basic objective of this paper is to reach the same results using reinforcement learning with general function approximators that can be achieved by using the classical Q lookup table on small input samples.


Offline Reinforcement Learning with Implicit Q-Learning

rail-berkeley/rlkit • 12 Oct 2021

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.
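
The "upper expectile" mentioned above can be made concrete with a small sketch. It is not the authors' implementation: the function names, the fixed-point solver, and the sample values are assumptions chosen for illustration. An expectile with parameter tau minimizes an asymmetric squared loss, so tau = 0.5 recovers the mean and tau closer to 1 moves the estimate toward the largest values.

```python
def expectile_loss(diff, tau):
    """Asymmetric squared loss: positive differences are weighted by tau,
    negative ones by 1 - tau."""
    weight = tau if diff > 0 else 1.0 - tau
    return weight * diff * diff


def expectile(values, tau, iters=100):
    """Find the tau-expectile of a sample by fixed-point iteration.

    The minimizer m of sum(expectile_loss(v - m, tau)) is a weighted mean
    whose weights depend on which side of m each value falls.
    """
    m = sum(values) / len(values)
    for _ in range(iters):
        weights = [tau if v > m else 1.0 - tau for v in values]
        m = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return m


# Hypothetical Q-values of the actions seen in the data for one state.
q_values = [0.0, 1.0, 2.0, 3.0, 4.0]
mean_value = expectile(q_values, 0.5)
upper_value = expectile(q_values, 0.9)
```

Here `upper_value` lies between the mean and the maximum of `q_values`, which is how an upper expectile can approximate the value of the best actions in a state without querying actions outside the data.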
