Value Function Estimation

Stochastic Dueling Network

Introduced by Wang et al. in Sample Efficient Actor-Critic with Experience Replay

A Stochastic Dueling Network (SDN) is an architecture for learning the state-value function $V$ together with the action-value function $Q$. The SDN learns both off-policy while keeping the two estimates consistent with each other. At each time step it outputs a stochastic estimate of $Q$ and a deterministic estimate of $V$.
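
To make the relationship between the two outputs concrete: in the source paper the stochastic estimate takes the form $\tilde{Q}(x_t, a_t) \sim V(x_t) + A(x_t, a_t) - \frac{1}{n}\sum_{i=1}^{n} A(x_t, u_i)$ with $u_i \sim \pi(\cdot \mid x_t)$, where $A$ is an advantage head and the $u_i$ are actions sampled from the current policy. The following is a minimal NumPy sketch of that combination step only, not the authors' implementation. The names `sdn_q_estimate` and `toy_advantage`, the quadratic advantage, and the Gaussian policy are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def sdn_q_estimate(value, advantage_fn, state, action, sampled_actions):
    """Stochastic Q estimate: V(x) + A(x, a) - mean_i A(x, u_i).

    `value` is the deterministic V(x); `sampled_actions` are the u_i,
    assumed to have been drawn from the current policy at `state`.
    """
    baseline = np.mean([advantage_fn(state, u) for u in sampled_actions])
    return value + advantage_fn(state, action) - baseline


def toy_advantage(state, action):
    # Hypothetical advantage head: a simple quadratic, for illustration only.
    return -float(np.sum((action - state) ** 2))


rng = np.random.default_rng(0)
state = np.array([0.5, -0.2])
action = np.array([0.4, 0.0])

# Assumption: a Gaussian policy centered at the state, used to sample u_i.
sampled = rng.normal(loc=state, scale=0.1, size=(5, 2))

q_tilde = sdn_q_estimate(1.0, toy_advantage, state, action, sampled)
```

Subtracting the sampled mean of the advantage is what ties the two estimates together: averaged over the sampled actions themselves, the $Q$ estimate reduces to $V$, so the advantage head cannot drift away from the value head.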

Source: Sample Efficient Actor-Critic with Experience Replay
