Q-Learning

381 papers with code • 0 benchmarks • 2 datasets

Q-learning is a model-free reinforcement learning algorithm that learns an action-value function Q(s, a). The policy derived from that function tells an agent which action to take in each state.
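To make the description above concrete, here is a minimal sketch of tabular Q-learning in plain Python. It is not taken from any of the papers listed on this page. The four-state chain environment, the `step` and `train` helpers, and the values of `alpha` and `gamma` are all assumptions chosen for illustration.

```python
import random

N_STATES = 4      # hypothetical chain: states 0..3, state 3 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy deterministic environment (an assumption for illustration)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0  # reward only on reaching the last state
    return next_state, reward, done


def q_learning_update(q, state, action, reward, next_state, done,
                      alpha=0.5, gamma=0.9):
    """One tabular step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = 0.0 if done else max(q[next_state])
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def train(episodes=500, seed=0):
    """Learn Q from a uniformly random behaviour policy (off-policy)."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            q_learning_update(q, state, action, reward, next_state, done)
            state = next_state
    return q


q_table = train()
# The learned policy is the greedy action in each non-terminal state.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

In this toy setting the greedy policy read off `q_table` moves right in every non-terminal state, even though the data was collected by acting at random. That is the sense in which Q-learning is off-policy.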

(Image credit: Playing Atari with Deep Reinforcement Learning)

Most implemented papers

Offline Reinforcement Learning with Implicit Q-Learning

rail-berkeley/rlkit 12 Oct 2021

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.
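The "upper expectile" mentioned in this abstract can be illustrated with a small sketch of asymmetric least squares in plain Python. This is not the rlkit implementation. The function names, the sample values, and the fixed-point solver are assumptions for illustration. In the paper's setting, `diff` would correspond to Q(s, a) - V(s) for actions that appear in the dataset.

```python
def expectile_loss(diff, tau):
    """Asymmetric squared error: weight tau when diff >= 0, else 1 - tau."""
    weight = tau if diff >= 0 else 1.0 - tau
    return weight * diff * diff


def fit_expectile(samples, tau, iterations=100):
    """Find the tau-expectile of samples by iterating a weighted mean.

    Assumes 0 < tau < 1. The minimizer of the summed expectile loss is a
    weighted average in which samples above the estimate get weight tau
    and samples below it get weight 1 - tau.
    """
    estimate = sum(samples) / len(samples)
    for _ in range(iterations):
        weights = [tau if x >= estimate else 1.0 - tau for x in samples]
        new_estimate = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
        if abs(new_estimate - estimate) < 1e-12:
            break
        estimate = new_estimate
    return estimate


# Hypothetical Q-values of the dataset actions observed in one state.
q_values = [0.0, 1.0, 2.0, 3.0, 10.0]
mean_value = fit_expectile(q_values, tau=0.5)   # tau = 0.5 recovers the mean
upper_value = fit_expectile(q_values, tau=0.9)  # pulled toward the best action
```

With `tau = 0.5` the estimate is the ordinary mean. As `tau` moves toward 1 it moves toward the largest sample, which is how an upper expectile can approximate the value of the best in-dataset action without evaluating actions that were never observed.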

ViZDoom: A Doom-based AI Research Platform for Visual Reinforcement Learning

mwydmuch/ViZDoom 6 May 2016

Here, we propose a novel test-bed platform for reinforcement learning research from raw visual information which employs the first-person perspective in a semi-realistic 3D world.

rlpyt: A Research Code Base for Deep Reinforcement Learning in PyTorch

astooke/rlpyt 3 Sep 2019

rlpyt is designed as a high-throughput code base for small- to medium-scale research in deep RL.

Continuous Deep Q-Learning with Model-based Acceleration

jakegrigsby/deep_control 2 Mar 2016

In this paper, we explore algorithms and representations to reduce the sample complexity of deep reinforcement learning for continuous control tasks.

Playing FPS Games with Deep Reinforcement Learning

glample/Arnold 18 Sep 2016

Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions.

Optimization of Molecules via Deep Reinforcement Learning

google-research/google-research 19 Oct 2018

We present a framework, which we call Molecule Deep $Q$-Networks (MolDQN), for molecule optimization by combining domain knowledge of chemistry and state-of-the-art reinforcement learning techniques (double $Q$-learning and randomized value functions).

DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation

lexfridman/deeptraffic 9 Jan 2018

We present a traffic simulation named DeepTraffic where the planning systems for a subset of the vehicles are handled by a neural network as part of a model-free, off-policy reinforcement learning process.

Randomized Ensembled Double Q-Learning: Learning Fast Without a Model

watchernyu/REDQ ICLR 2021

Using a high Update-To-Data (UTD) ratio, model-based methods have recently achieved much higher sample efficiency than previous model-free methods for continuous-action DRL benchmarks.

Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition

borea17/efficient_rl 21 May 1999

The paper presents an online model-free learning algorithm, MAXQ-Q, and proves that it converges with probability 1 to a kind of locally-optimal policy known as a recursively optimal policy, even in the presence of the five kinds of state abstraction.

Deep Recurrent Q-Learning for Partially Observable MDPs

marload/DeepRL-TensorFlow2 23 Jul 2015

Deep Reinforcement Learning has yielded proficient controllers for complex tasks.