Q-Learning
386 papers with code • 0 benchmarks • 2 datasets
Q-learning is a model-free reinforcement learning algorithm. It learns an action-value function Q(s, a), from which the agent derives a policy that tells it which action to take in each state.
(Image credit: Playing Atari with Deep Reinforcement Learning)
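To make the idea of learning a policy concrete, here is a minimal sketch of tabular Q-learning. Everything specific in it is an assumption chosen for illustration and does not come from this page: the toy 1-D chain environment (start at state 0, reward 1 on reaching the last state), the hyperparameter values, and the function names `q_learning` and `greedy_policy`. The update applied at each step is the standard rule Q(s, a) ← Q(s, a) + α [r + γ max_a' Q(s', a') − Q(s, a)].

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain environment.

    The agent starts in state 0 and receives reward 1.0 when it reaches
    the last state, which ends the episode. All other rewards are 0.
    """
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0: left, action 1: right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon:
                a = rng.randrange(len(moves))
            else:
                best = max(q[s])
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])

            # Deterministic transition, clipped to the ends of the chain.
            s_next = min(max(s + moves[a], 0), goal)
            r = 1.0 if s_next == goal else 0.0

            # Q-learning update. The goal row is never updated, so it
            # stays at 0 and contributes no bootstrapped value.
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for each state."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in q]
```

Calling `greedy_policy(q_learning())` returns one action index per state. The entry for the terminal state is not meaningful, since no action is ever taken there.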
Latest papers
Superior Genetic Algorithms for the Target Set Selection Problem Based on Power-Law Parameter Choices and Simple Greedy Heuristics
Besides providing a superior algorithm for the TSS problem, this work shows that randomized parameter choices and elementary greedy heuristics can give better results than complex algorithms and costly parameter tuning.
Laser Learning Environment: A new environment for coordination-critical multi-agent tasks
We introduce the Laser Learning Environment (LLE), a collaborative multi-agent reinforcement learning environment in which coordination is central.
From Two-Dimensional to Three-Dimensional Environment with Q-Learning: Modeling Autonomous Navigation with Reinforcement Learning and no Libraries
This study explores the performance of RL agents in both two-dimensional (2D) and three-dimensional (3D) environments, aiming to research the dynamics of learning across different spatial dimensions.
Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding
We first propose a selective communication block to gather richer information for better agent coordination within multi-agent environments and train the model with a Q-learning-based algorithm.
Scalable Online Exploration via Coverability
We propose exploration objectives -- policy optimization objectives that enable downstream maximization of any reward function -- as a conceptual framework to systematize the study of exploration.
Belief-Enriched Pessimistic Q-Learning against Adversarial State Perturbations
Existing solutions either introduce a regularization term to improve the smoothness of the trained policy against perturbations or alternatively train the agent's policy and the attacker's policy.
Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning
To address this, we introduce Efficient episodic Memory Utilization (EMU) for MARL, with two primary objectives: (a) accelerating reinforcement learning by leveraging semantically coherent memory from an episodic buffer and (b) selectively promoting desirable transitions to prevent local convergence.
Leveraging Digital Cousins for Ensemble Q-Learning in Large-Scale Wireless Networks
Herein, a novel ensemble Q-learning algorithm that addresses the performance and complexity challenges of the traditional Q-learning algorithm for optimizing wireless networks is presented.
Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy Optimization
Reinforcement learning (RL) is a classical tool to solve network control or policy optimization problems in unknown environments.
Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents
In this work, we propose a novel approach, called Logical Specifications-guided Dynamic Task Sampling (LSTS), that learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification, while minimizing the number of environmental interactions.