Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution

Reinforcement Learning algorithms require a large number of samples to solve complex tasks with sparse and delayed rewards. Complex tasks can often be hierarchically decomposed into sub-tasks... (read more)

PDF Abstract ICLR 2021 PDF (under review) ICLR 2021 Abstract (under review)

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods used in the Paper