Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation

In this work, we propose to apply trust region optimization to deep reinforcement learning using a recently proposed Kronecker-factored approximation to the curvature. We extend the framework of natural policy gradient and propose to optimize both the actor and the critic using Kronecker-factored approximate curvature (K-FAC) with trust region; hence we call our method Actor Critic using Kronecker-Factored Trust Region (ACKTR)... (read more)

PDF Abstract NeurIPS 2017 PDF NeurIPS 2017 Abstract

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods used in the Paper


METHOD TYPE
Entropy Regularization
Regularization
Dense Connections
Feedforward Networks
Convolution
Convolutions
ACTKR
Policy Gradient Methods