Search Results for author: Hamid R. Maei

Found 2 papers, 0 papers with code

Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation

no code implementations • NeurIPS 2009 • Shalabh Bhatnagar, Doina Precup, David Silver, Richard S. Sutton, Hamid R. Maei, Csaba Szepesvári

We introduce the first temporal-difference learning algorithms that converge with smooth value function approximators, such as neural networks.

Q-Learning

A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation

no code implementations • NeurIPS 2008 • Richard S. Sutton, Hamid R. Maei, Csaba Szepesvári

We introduce the first temporal-difference learning algorithm that is stable with linear function approximation and off-policy training, for any finite Markov decision process, target policy, and exciting behavior policy, and whose complexity scales linearly in the number of parameters.
