$n$-step Returns are used for value function estimation in reinforcement learning. Specifically, for $n$ steps we can write the complete return as:
$$ R_{t}^{(n)} = r_{t+1} + \gamma r_{t+2} + \cdots + \gamma^{n-1} r_{t+n} + \gamma^{n} V_{t}\left(s_{t+n}\right) $$
We can then write an $n$-step backup, in the style of TD learning, as:
$$ \Delta V_{t}\left(s_{t}\right) = \alpha\left[R_{t}^{(n)} - V_{t}\left(s_{t}\right)\right] $$
Multi-step returns often lead to faster learning than one-step TD, provided $n$ is suitably tuned: larger $n$ propagates reward information further per update at the cost of higher variance.
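The two equations above can be sketched directly in code. The function and variable names below (`n_step_return`, `n_step_backup`, `v_next`) are illustrative choices, not part of any standard library:

```python
def n_step_return(rewards, v_next, gamma, n):
    """Compute R_t^(n) = r_{t+1} + gamma*r_{t+2} + ... + gamma^{n-1}*r_{t+n}
    + gamma^n * V_t(s_{t+n}).

    rewards: the n rewards r_{t+1}, ..., r_{t+n}
    v_next:  current value estimate V_t(s_{t+n}) of the bootstrap state
    """
    g = sum(gamma ** k * r for k, r in enumerate(rewards[:n]))
    return g + gamma ** n * v_next


def n_step_backup(v_s, n_step_ret, alpha):
    """TD-style update: V(s_t) <- V(s_t) + alpha * [R_t^(n) - V(s_t)]."""
    return v_s + alpha * (n_step_ret - v_s)
```

For example, with `rewards=[1.0, 1.0, 1.0]`, `gamma=0.9`, `n=3`, and a bootstrap value of `10.0`, the 3-step return is `1 + 0.9 + 0.81 + 0.729 * 10 = 10.0`; a backup with `alpha=0.5` then moves `V(s_t)` halfway toward that target.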
Image Credit: Sutton and Barto, *Reinforcement Learning: An Introduction*
| Task | Papers | Share |
|---|---|---|
| Reinforcement Learning (RL) | 19 | 38.78% |
| Atari Games | 6 | 12.24% |
| Continuous Control | 4 | 8.16% |
| OpenAI Gym | 3 | 6.12% |
| Decision Making | 3 | 6.12% |
| Distributional Reinforcement Learning | 2 | 4.08% |
| Benchmarking | 2 | 4.08% |
| Multi-agent Reinforcement Learning | 1 | 2.04% |
| DQN Replay Dataset | 1 | 2.04% |