Ape-X is a distributed architecture for deep reinforcement learning. The algorithm decouples acting from learning: the actors interact with their own instances of the environment by selecting actions according to a shared neural network, and accumulate the resulting experience in a shared experience replay memory; the learner replays samples of experience and updates the neural network. The architecture relies on prioritized experience replay to focus only on the most significant data generated by the actors.
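The sketch below illustrates this data flow in a single process, assuming a toy tabular Q-function and an invented environment; names like `PrioritizedReplay`, `actor`, and `learner` are illustrative and not from the paper's implementation, which runs actors and the learner as separate processes and uses sum-tree replay with batched transfers.

```python
# Minimal single-process sketch of the Ape-X data flow (not the paper's
# distributed implementation): several "actors" fill one shared prioritized
# replay from their own environment copies, while a "learner" samples by
# priority and feeds fresh TD errors back as new priorities.
import random
from collections import namedtuple

Transition = namedtuple("Transition", "state action reward next_state done")

class PrioritizedReplay:
    """Shared replay memory with proportional prioritized sampling."""
    def __init__(self, capacity=100_000, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.priorities = [], []

    def add(self, transition, priority):
        if len(self.data) >= self.capacity:        # drop oldest when full
            self.data.pop(0); self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(max(priority, 1e-6) ** self.alpha)

    def sample(self, batch_size):
        idxs = random.choices(range(len(self.data)),
                              weights=self.priorities, k=batch_size)
        return idxs, [self.data[i] for i in idxs]

    def update_priorities(self, idxs, priorities):
        for i, p in zip(idxs, priorities):
            self.priorities[i] = max(p, 1e-6) ** self.alpha

def toy_env_step(state, action):
    """Stand-in environment: random reward, simple next state, rare episode end."""
    return random.random(), (state + action) % 10, random.random() < 0.05

def q_values(params, state):
    """Stand-in 'network': a lookup table shared by actors and learner."""
    return params.setdefault(state, [0.0, 0.0])

def actor(params, replay, epsilon=0.4, steps=200):
    """Act in a private env copy; push experience with an initial priority."""
    state = 0
    for _ in range(steps):
        q = q_values(params, state)
        action = random.randrange(2) if random.random() < epsilon else q.index(max(q))
        reward, next_state, done = toy_env_step(state, action)
        td_error = abs(reward + 0.99 * max(q_values(params, next_state)) - q[action])
        replay.add(Transition(state, action, reward, next_state, done), td_error)
        state = 0 if done else next_state

def learner(params, replay, batch_size=32, lr=0.1):
    """Sample by priority, update the shared params, refresh the priorities."""
    idxs, batch = replay.sample(batch_size)
    new_priorities = []
    for t in batch:
        target = t.reward + (0.0 if t.done else 0.99 * max(q_values(params, t.next_state)))
        q = q_values(params, t.state)
        td_error = target - q[t.action]
        q[t.action] += lr * td_error
        new_priorities.append(abs(td_error))
    replay.update_priorities(idxs, new_priorities)

params, replay = {}, PrioritizedReplay()
for _ in range(4):                      # four actors sharing params and replay
    actor(params, replay)
for _ in range(100):
    learner(params, replay)
```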
In contrast to Gorila, Ape-X uses a single, shared replay memory and, instead of sampling uniformly, prioritizes so that the most useful data is sampled more often. All communication with the centralized replay is batched, increasing efficiency and throughput at the cost of some latency. Because it learns off-policy, Ape-X can combine data from many distributed actors: each actor is given a different exploration policy, broadening the diversity of the experience they jointly encounter.
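As a concrete instance of "different exploration policies", the Ape-X paper has each actor i ∈ {0, …, N−1} run ε-greedy exploration with its own ε_i = ε^(1 + α·i/(N−1)), using ε = 0.4 and α = 7, so low-index actors explore heavily while high-index actors act almost greedily. A short sketch (the function name `actor_epsilons` is ours, not from the paper):

```python
# Per-actor exploration schedule from the Ape-X paper:
# epsilon_i = eps ** (1 + alpha * i / (N - 1)), with eps = 0.4 and alpha = 7.
def actor_epsilons(num_actors, eps=0.4, alpha=7.0):
    return [eps ** (1 + alpha * i / (num_actors - 1)) for i in range(num_actors)]

print(actor_epsilons(8))
# [0.4, 0.16, 0.064, ..., 0.00066]  -- monotonically decreasing across actors
```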
PAPER | DATE |
---|---|
Recurrent Distributed Reinforcement Learning for Partially Observable Robotic Assembly | 2020-10-15 |
Dynamic Experience Replay | 2020-03-04 |
Google Research Football: A Novel Reinforcement Learning Environment | 2019-07-25 |
Macro action selection with deep reinforcement learning in StarCraft | 2018-12-02 |
An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution | 2018-07-09 |
Deep Curiosity Search: Intra-Life Exploration Can Improve Performance on Challenging Deep Reinforcement Learning Problems | 2018-06-01 |
Distributed Prioritized Experience Replay | 2018-03-02 |
TASK | PAPERS | SHARE |
---|---|---|
Atari Games | 2 | 28.57% |
Game of Football | 1 | 14.29% |
Real-Time Strategy Games | 1 | 14.29% |
Starcraft | 1 | 14.29% |
Image Classification | 1 | 14.29% |
Montezuma's Revenge | 1 | 14.29% |
COMPONENT | TYPE |
---|---|
Prioritized Experience Replay | Replay Memory |