Parameterized MDPs and Reinforcement Learning Problems -- A Maximum Entropy Principle Based Framework

We present a framework to address a class of sequential decision making problems. Our framework features learning the optimal control policy with robustness to noisy data, determining the unknown state and action parameters, and performing sensitivity analysis with respect to problem parameters... (read more)

PDF Abstract
No code implementations yet. Submit your code now


  Add Datasets introduced or used in this paper

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods used in the Paper

Double Q-learning
Off-Policy TD Control
Off-Policy TD Control