Q-Learning

386 papers with code • 0 benchmarks • 2 datasets

Q-learning is a model-free reinforcement learning algorithm. Its goal is to learn a policy, which tells an agent what action to take in each state.
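As a rough illustration of what "learning a policy" means here, the following is a minimal sketch of tabular Q-learning. It is not taken from any of the papers or libraries listed on this page. The corridor environment, the names `step`, `train`, and `greedy_policy`, and the hyperparameter values are all assumptions chosen for the example.

```python
import random

# Hypothetical toy environment (an assumption for this sketch):
# a corridor of N_STATES cells. The agent starts in cell 0 and
# receives reward 1.0 when it reaches the last cell, which ends the episode.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for the toy corridor."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                # Explore: pick a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus the discounted value of the
            # best action in the next state (zero if the episode ended).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """The learned policy: in each state, take the highest-valued action."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES)]


q_table = train()
policy = greedy_policy(q_table)
print(policy[:-1])  # [1, 1, 1, 1]: move right in every non-terminal cell
```

The policy is read off the table by taking the highest-valued action in each state, so the learned table is the policy in this setting. Deep variants such as the Atari work credited on this page replace the table with a neural network, which this sketch does not attempt.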

(Image credit: Playing Atari with Deep Reinforcement Learning)

Benchmarks

These leaderboards are used to track progress in Q-Learning.

No evaluation results yet.

Libraries

Use these libraries to find Q-Learning models and implementations.

opendilab/DI-engine • 6 papers • 2,513 stars

zzmtsvv/rl_task • 6 papers

hill-a/stable-baselines • 5 papers • 4,038 stars

toni-sm/skrl • 5 papers • 397 stars

See all 29 libraries.

Datasets

Latest papers

Superior Genetic Algorithms for the Target Set Selection Problem Based on Power-Law Parameter Choices and Simple Greedy Heuristics

faceonlive/ai-research • 132 stars • 5 Apr 2024

Besides providing a superior algorithm for the TSS problem, this work shows that randomized parameter choices and elementary greedy heuristics can give better results than complex algorithms and costly parameter tuning.

Laser Learning Environment: A new environment for coordination-critical multi-agent tasks

yamoling/lle • 4 Apr 2024

We introduce the Laser Learning Environment (LLE), a collaborative multi-agent reinforcement learning environment in which coordination is central.

From Two-Dimensional to Three-Dimensional Environment with Q-Learning: Modeling Autonomous Navigation with Reinforcement Learning and no Libraries

ergoncugler/q-learning • 27 Mar 2024

This study explores the performance of RL agents in both two-dimensional (2D) and three-dimensional (3D) environments, aiming to research the dynamics of learning across different spatial dimensions.

Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding

ai4co/eph-mapf • 12 Mar 2024

We first propose a selective communication block to gather richer information for better agent coordination within multi-agent environments and train the model with a Q-learning-based algorithm.

Scalable Online Exploration via Coverability

philip-amortila/l1-coverability • 11 Mar 2024

We propose exploration objectives -- policy optimization objectives that enable downstream maximization of any reward function -- as a conceptual framework to systematize the study of exploration.

Belief-Enriched Pessimistic Q-Learning against Adversarial State Perturbations

sliencerx/belief-enriched-robust-q-learning • 6 Mar 2024

Existing solutions either introduce a regularization term to improve the smoothness of the trained policy against perturbations or alternatively train the agent's policy and the attacker's policy.

Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning

hyunghona/emu • 2 Mar 2024

To address this, we introduce Efficient episodic Memory Utilization (EMU) for MARL, with two primary objectives: (a) accelerating reinforcement learning by leveraging semantically coherent memory from an episodic buffer and (b) selectively promoting desirable transitions to prevent local convergence.

Leveraging Digital Cousins for Ensemble Q-Learning in Large-Scale Wireless Networks

talhabozkus/digital-cousins-for-ensemble-q-learning • 12 Feb 2024

Herein, a novel ensemble Q-learning algorithm that addresses the performance and complexity challenges of the traditional Q-learning algorithm for optimizing wireless networks is presented.

Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy Optimization

talhabozkus/tsp_23_supplementary_file • 8 Feb 2024

Reinforcement learning (RL) is a classical tool to solve network control or policy optimization problems in unknown environments.

Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents

shukla-yash/lsts-icaps-24 • 6 Feb 2024

In this work, we propose a novel approach, called Logical Specifications-guided Dynamic Task Sampling (LSTS), that learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification, while minimizing the number of environmental interactions.
