Q-Learning

388 papers with code • 0 benchmarks • 2 datasets

Q-learning is a model-free reinforcement learning algorithm. Its goal is to learn a policy that tells an agent which action to take in each state.
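As a rough illustration of how such a policy can be learned, here is a minimal sketch of tabular Q-learning. The corridor environment, the function names `train_q_table` and `greedy_policy`, and all hyperparameter values are assumptions made for this example and do not come from any of the papers listed on this page.

```python
import random


def train_q_table(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the rightmost state ends the episode with reward 1.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy: explore with probability epsilon (or on ties),
            # otherwise take the action with the larger Q-value.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # The terminal state has no future value.
            future = 0.0 if next_state == goal else max(q[next_state])
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

Calling `greedy_policy(train_q_table())` returns one action per state. In this toy corridor the greedy action in every non-terminal state should be 1 (move right), since that is the only way to reach the reward.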



Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents

shukla-yash/lsts-icaps-24 6 Feb 2024

In this work, we propose a novel approach, called Logical Specifications-guided Dynamic Task Sampling (LSTS), that learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification, while minimizing the number of environmental interactions.

0
06 Feb 2024

RadDQN: a Deep Q Learning-based Architecture for Finding Time-efficient Minimum Radiation Exposure Pathway

biswajitsadhu/raddqn 1 Feb 2024

However, the lack of an efficient reward function and an effective exploration strategy has thwarted its implementation in the development of radiation-aware autonomous unmanned aerial vehicles (UAVs) for achieving maximum radiation protection.

3
01 Feb 2024

VQC-Based Reinforcement Learning with Data Re-uploading: Performance and Trainability

rodrigocoelho7/vqc_qlearning 21 Jan 2024

This work empirically studies the performance and trainability of such VQC-based Deep Q-Learning models in classic control benchmark environments.

1
21 Jan 2024

A Semantic-Aware Multiple Access Scheme for Distributed, Dynamic 6G-Based Applications

hamidreza-mazandarani/SAMA-D3QL 12 Jan 2024

The emergence of the semantic-aware paradigm presents opportunities for innovative services, especially in the context of 6G-based applications.

3
12 Jan 2024

Decision Making in Non-Stationary Environments with Policy-Augmented Search

scope-lab-vu/PAMCTS 6 Jan 2024

In this paper, we introduce Policy-Augmented Monte Carlo tree search (PA-MCTS), which combines action-value estimates from an out-of-date policy with an online search using an up-to-date model of the environment.

4
06 Jan 2024

SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning

dohyeoklee/SPQR NeurIPS 2023

Alleviating overestimation bias is a critical challenge for deep reinforcement learning to achieve successful performance on more complex tasks or offline datasets containing out-of-distribution data.

2
06 Jan 2024

Investigating the Performance and Reliability, of the Q-Learning Algorithm in Various Unknown Environments

amirhnourian/Open_AI_Frozenlake 2023 11th RSI International Conference on Robotics and Mechatronics (ICRoM) 2023

As previously indicated, the majority of the conclusions of this study about the relationship between computation cost and environment and also dependability can be transferred to more sophisticated temporal difference-based algorithms because all methods are iterative.

1
19 Dec 2023

Sample Efficient Reinforcement Learning with Partial Dynamics Knowledge

meshal-h/ucb-f 19 Dec 2023

In the setting of finite episodic Markov decision processes with $S$ states, $A$ actions, and episode length $H$, we present an optimistic Q-learning algorithm that achieves $\tilde{\mathcal{O}}(\text{Poly}(H)\sqrt{T})$ regret under perfect knowledge of $f$, where $T$ is the total number of interactions with the system.

1
19 Dec 2023

I Open at the Close: A Deep Reinforcement Learning Evaluation of Open Streets Initiatives

rtealwitter/openstreets 12 Dec 2023

In order to simulate the impact of opening streets, we first compare models for predicting vehicle collisions given network and temporal data.

3
12 Dec 2023

Efficient Sparse-Reward Goal-Conditioned Reinforcement Learning with a High Replay Ratio and Regularization

takuyahiraoka/efficient-srgc-rl-with-a-high-rr-and-regularization 10 Dec 2023

The simplified REDQ with our modifications achieves $\sim 8 \times$ better sample efficiency than the SoTA methods in 4 Fetch tasks of Robotics.

1
10 Dec 2023