5 Nov 2016 • Frank S. He, Yang Liu, Alexander G. Schwing, Jian Peng
We propose a novel training algorithm for reinforcement learning that combines the strengths of deep Q-learning with a constrained optimization approach to tighten optimality and encourage faster reward propagation.
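To make the idea concrete, the following is a minimal sketch of one way a constraint on Q-values could be folded into a deep Q-learning loss. It is not taken from the abstract: the choice of a multi-step return as a lower bound on the optimal Q-value, the quadratic penalty on violations, and every name here (`lower_bound`, `penalized_loss`, `bootstrap_values`, `horizon`, `lam`) are assumptions made purely for illustration.

```python
from typing import List


def lower_bound(rewards: List[float], bootstrap_values: List[float],
                j: int, gamma: float, horizon: int) -> float:
    """Largest k-step return starting at step j, for k = 0 .. horizon - 1.

    Assumed layout (hypothetical, for illustration only):
      rewards[t]          reward received after acting at step t
      bootstrap_values[t] max_a Q(s_t, a) from a target network,
                          0.0 at a terminal state; len(rewards) + 1 entries
    Each k-step return is a lower bound on the optimal Q(s_j, a_j), so the
    maximum over k is the tightest of these bounds.
    """
    best = float("-inf")
    partial = 0.0  # discounted reward sum accumulated so far
    for k in range(horizon):
        t = j + k
        if t >= len(rewards):
            break
        partial += gamma ** k * rewards[t]
        best = max(best, partial + gamma ** (k + 1) * bootstrap_values[t + 1])
    return best


def penalized_loss(q: float, target: float, bound: float, lam: float) -> float:
    """Squared TD error plus a quadratic penalty when q falls below the bound."""
    td_error = (q - target) ** 2
    violation = max(bound - q, 0.0)  # zero when the constraint is satisfied
    return td_error + lam * violation ** 2


# A reward that arrives two steps later already raises the bound at step 0.
rewards = [0.0, 0.0, 1.0]
bootstrap_values = [0.0, 0.0, 0.0, 0.0]
bound = lower_bound(rewards, bootstrap_values, j=0, gamma=0.5, horizon=3)
loss = penalized_loss(q=0.0, target=0.0, bound=bound, lam=4.0)
```

In this toy trajectory the one-step target at step 0 carries no signal, yet the penalty term is nonzero because the delayed reward raises the bound. Under the stated assumptions, this is the sense in which a constraint can speed up reward propagation.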