Entropy Regularization is a type of regularization used in reinforcement learning. For on-policy policy-gradient methods like A3C, mutual reinforcement between the actor and critic can produce a highly peaked $\pi\left(a\mid{s}\right)$ concentrated on a few actions or action sequences, since it is easier for the actor and critic to over-optimise on a small portion of the environment. To mitigate this problem, entropy regularization adds an entropy term to the loss to promote action diversity:
$$H\left(\pi\left(\cdot\mid s\right)\right) = -\sum_{a}\pi\left(a\mid{s}\right)\log\pi\left(a\mid{s}\right)$$
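A minimal sketch of how the entropy bonus enters a policy-gradient loss (the helper names and the coefficient `beta` are illustrative, not from the A3C paper; plain NumPy is used for clarity):

```python
import numpy as np

def entropy(probs, eps=1e-12):
    # H(pi(.|s)) = -sum_a pi(a|s) * log pi(a|s); eps guards against log(0)
    return -np.sum(probs * np.log(probs + eps))

def pg_loss_with_entropy(log_prob_action, advantage, probs, beta=0.01):
    # Standard policy-gradient loss term minus a weighted entropy bonus:
    # minimising it favours high-advantage actions, while the entropy
    # term penalises policies that collapse onto a few actions.
    return -log_prob_action * advantage - beta * entropy(probs)

# A uniform policy over 4 actions attains the maximum entropy log(4),
# while a peaked policy has much lower entropy (and a larger loss penalty).
uniform = np.ones(4) / 4
peaked = np.array([0.97, 0.01, 0.01, 0.01])
```

The entropy bonus is largest for the uniform policy, so the regularizer rewards keeping probability mass spread over actions until the advantage signal justifies committing to one.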
Source: Asynchronous Methods for Deep Reinforcement Learning
| Task | Papers | Share |
|---|---|---|
| Reinforcement Learning (RL) | 152 | 18.31% |
| Autonomous Driving | 110 | 13.25% |
| Autonomous Vehicles | 41 | 4.94% |
| Imitation Learning | 31 | 3.73% |
| Decision Making | 28 | 3.37% |
| Object Detection | 24 | 2.89% |
| Semantic Segmentation | 20 | 2.41% |
| Continuous Control | 17 | 2.05% |
| Multi-agent Reinforcement Learning | 13 | 1.57% |