Distillation Strategies for Proximal Policy Optimization

23 Jan 2019 · Sam Green, Craig M. Vineyard, Çetin Kaya Koç

Vision-based deep reinforcement learning (RL) typically obtains a performance benefit by using high-capacity and relatively large convolutional neural networks (CNNs). However, a large network leads to higher inference costs (power, latency, silicon area, MAC count)...

Methods used in the Paper

METHOD                   TYPE
Entropy Regularization   Regularization
PPO                      Policy Gradient Methods
Dense Connections        Feedforward Networks
Convolution              Convolutions
Q-Learning               Off-Policy TD Control
DQN                      Q-Learning Networks