Dot-Product Attention

Introduced by Luong et al. in Effective Approaches to Attention-based Neural Machine Translation

Dot-Product Attention is an attention mechanism where the alignment score function is calculated as:

$$f_{att}\left(\textbf{h}_{i}, \textbf{s}_{j}\right) = \textbf{h}_{i}^{T}\textbf{s}_{j}$$

It is equivalent to multiplicative attention with the trainable weight matrix replaced by the identity matrix. Here $\textbf{h}_{i}$ is the $i$-th hidden state of the encoder and $\textbf{s}_{j}$ is the $j$-th hidden state of the decoder. Because the score is a plain dot product with no learned parameters, the two vectors must have the same dimensionality.
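As a rough illustration of the score function, the sketch below computes all pairwise scores at once with NumPy. The function name `dot_product_scores`, the array names `H` and `S`, and the row-per-state layout are assumptions made for this example, not something taken from the paper.

```python
import numpy as np

def dot_product_scores(H, S):
    """Return scores[i, j] = h_i^T s_j.

    H: (n_enc, d) array, one encoder hidden state per row (assumed layout).
    S: (n_dec, d) array, one decoder hidden state per row (assumed layout).
    """
    H = np.asarray(H, dtype=float)
    S = np.asarray(S, dtype=float)
    if H.shape[1] != S.shape[1]:
        # A dot product is only defined for vectors of equal length.
        raise ValueError("encoder and decoder states must share dimension d")
    # One matrix product yields every pairwise dot product.
    return H @ S.T

# Toy values chosen only for illustration.
H = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])  # three encoder states
S = np.array([[3.0, 4.0]])                          # one decoder state
scores = dot_product_scores(H, S)                   # shape (3, 1)
```

With these toy values the three scores are 3, 8 and 7, so the second encoder state aligns most strongly with the decoder state.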

Within a neural network, once we have the alignment scores, we compute the final attention weights by applying a softmax function to them, which ensures the weights are non-negative and sum to 1.
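The normalization step can be sketched as follows. This is a minimal example, assuming the softmax is taken over the encoder positions for a single decoder state; the helper name `attention_weights` and the toy values are hypothetical.

```python
import numpy as np

def attention_weights(scores):
    """Softmax over a 1-D array of alignment scores for one decoder state."""
    scores = np.asarray(scores, dtype=float)
    # Subtracting the maximum leaves the result unchanged but avoids overflow.
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

# Toy values chosen only for illustration.
h = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])  # encoder states, one per row
s = np.array([3.0, 4.0])                            # a single decoder state
weights = attention_weights(h @ s)                  # scores are h_i^T s
```

The largest weight falls on the encoder state with the largest dot product, and the weights sum to 1 up to floating-point error.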

Source: Effective Approaches to Attention-based Neural Machine Translation

Tasks

Task	Papers	Share
Image Generation	44	11.89%
Conditional Image Generation	17	4.59%
Semantic Segmentation	14	3.78%
Translation	12	3.24%
Image Classification	11	2.97%
Language Modelling	10	2.70%
Decision Making	8	2.16%
Super-Resolution	8	2.16%
Reinforcement Learning (RL)	8	2.16%

Categories

Attention Mechanisms