Many recent reinforcement learning (RL) methods learn stochastic policies with entropy regularization for exploration and robustness. However, in continuous action spaces, integrating entropy regularization with expressive policies is challenging and usually requires complex inference procedures...
METHOD | TYPE |
---|---|
 | Regularization |