Search Results for author: Hamid R. Maei

Found 2 papers, 0 papers with code

Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation

no code implementations · NeurIPS 2009 · Shalabh Bhatnagar, Doina Precup, David Silver, Richard S. Sutton, Hamid R. Maei, Csaba Szepesvári

We introduce the first temporal-difference learning algorithms that converge with smooth value function approximators, such as neural networks.

Q-Learning

A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation

no code implementations · NeurIPS 2008 · Richard S. Sutton, Hamid R. Maei, Csaba Szepesvári

We introduce the first temporal-difference learning algorithm that is stable with linear function approximation and off-policy training, for any finite Markov decision process, target policy, and exciting behavior policy, and whose complexity scales linearly in the number of parameters.
