Value Function Estimation

N-step Returns

$n$-step Returns are used for value function estimation in reinforcement learning. Specifically, for $n$ steps we can write the complete return as:

$$ R_{t}^{(n)} = r_{t+1} + \gamma{r}_{t+2} + \cdots + \gamma^{n-1}_{t+n} + \gamma^{n}V_{t}\left(s_{t+n}\right) $$

We can then write an $n$-step backup, in the style of TD learning, as:

$$ \Delta{V}_{r}\left(s_{t}\right) = \alpha\left[R_{t}^{(n)} - V_{t}\left(s_{t}\right)\right] $$

Multi-step returns often lead to faster learning with suitably tuned $n$.

Image Credit: Sutton and Barto, Reinforcement Learning

Papers


Paper Code Results Date Stars

Components


Component Type
🤖 No Components Found You can add them if they exist; e.g. Mask R-CNN uses RoIAlign

Categories