Q-Learning
388 papers with code • 0 benchmarks • 2 datasets
The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.
(Image credit: Playing Atari with Deep Reinforcement Learning)
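The one-step update behind this policy-learning goal can be sketched in a few lines. A minimal tabular version, with an assumed environment interface `step(s, a) -> (next_state, reward, done)` and illustrative hyperparameters (none of this comes from a specific paper on this page):

```python
import random

def q_learning(n_states, n_actions, episodes, step, alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular Q-learning sketch; `step(s, a) -> (s2, r, done)` is an assumed
    environment interface, and alpha/gamma/eps are illustrative defaults."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: Q[s][i])
            s2, r, done = step(s, a)
            # temporal-difference update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            target = r + gamma * max(Q[s2]) * (not done)
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q
```

On a toy two-state chain where one action terminates with reward 1, `Q[0][1]` converges toward 1, i.e. the learned table encodes which action to take in which state.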
Benchmarks
These leaderboards are used to track progress in Q-Learning.
Libraries
Use these libraries to find Q-Learning models and implementations.
Latest papers with no code
Unified ODE Analysis of Smooth Q-Learning Algorithms
This work applies the ordinary differential equation (ODE) approach to prove the convergence of asynchronous Q-learning modeled as a continuous-time switching system; notions from switching system theory are used to establish its asymptotic stability without explicit Lyapunov arguments.
Continuous-time Risk-sensitive Reinforcement Learning via Quadratic Variation Penalty
Owing to the martingale perspective in Jia and Zhou (2023), the risk-sensitive RL problem is shown to be equivalent to ensuring the martingale property of a process involving both the value function and the q-function, augmented by an additional penalty term: the quadratic variation of the value process, which captures the variability of the value-to-go along the trajectory.
From $r$ to $Q^*$: Your Language Model is Secretly a Q-Function
Standard RLHF deploys reinforcement learning in a specific token-level MDP, while DPO is derived as a bandit problem in which the whole response of the model is treated as a single arm.
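The bandit view described here treats a whole response as a single arm scored by its total log-probability. A hedged sketch of the standard DPO objective under that view, where the summed token log-probs of the chosen (`w`) and rejected (`l`) responses are assumed inputs (variable names and the default `beta` are illustrative, not from the paper):

```python
import math

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Bandit-style DPO loss sketch: -log sigmoid of the scaled margin between
    policy-vs-reference log-ratios of the preferred and rejected responses."""
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)
```

When the policy matches the reference, the margin is zero and the loss is log 2; widening the preferred response's log-ratio over the rejected one drives the loss down, which is the sense in which the response-level log-ratio acts as an implicit reward.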
Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL
We evaluate our tracker on several high-fidelity environments with challenging situations, such as distraction and occlusion.
Advancing Forest Fire Prevention: Deep Reinforcement Learning for Effective Firebreak Placement
To the best of our knowledge, this study represents a pioneering effort in using Reinforcement Learning for firebreak placement, offering promising perspectives in fire prevention and landscape management.
Prelimit Coupling and Steady-State Convergence of Constant-stepsize Nonsmooth Contractive SA
Motivated by Q-learning, we study nonsmooth contractive stochastic approximation (SA) with constant stepsize.
Traffic Signal Control and Speed Offset Coordination Using Q-Learning for Arterial Road Networks
We evaluate the performance of the proposed arterial traffic control strategy using microscopic traffic simulations of an arterial corridor with seven intersections near the I-710 freeway.
Deep Reinforcement Learning Control for Disturbance Rejection in a Nonlinear Dynamic System with Parametric Uncertainty
This work describes a technique for active rejection of multiple independent, time-correlated stochastic disturbances for a nonlinear flexible inverted-pendulum-on-cart system with uncertain model parameters.
Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution
Recent reinforcement learning approaches have shown that bang-bang policies are surprisingly effective at solving continuous control benchmarks.
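The idea of adaptive control resolution can be illustrated by how a continuous action interval is discretized: two bins recover a bang-bang policy, and growing the bin count refines control. A minimal sketch (the helper name and interface are assumptions for illustration, not the paper's API):

```python
def action_bins(low, high, n):
    """Return n evenly spaced discrete actions covering [low, high].
    n=2 yields a bang-bang action set {low, high}; larger n refines resolution."""
    return [low + (high - low) * i / (n - 1) for i in range(n)]
```

For example, `action_bins(-1.0, 1.0, 2)` gives the bang-bang set `[-1.0, 1.0]`, while `action_bins(-1.0, 1.0, 5)` gives `[-1.0, -0.5, 0.0, 0.5, 1.0]`.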
Utilizing Maximum Mean Discrepancy Barycenter for Propagating the Uncertainty of Value Functions in Reinforcement Learning
Accounting for the uncertainty of value functions boosts exploration in Reinforcement Learning (RL).