Paper

Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution

Reinforcement Learning algorithms require a large number of samples to solve complex tasks with sparse and delayed rewards. Complex tasks can often be hierarchically decomposed into sub-tasks... (read more)

Results in Papers With Code
(↓ scroll down to see all results)