The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction

15 Jul 2020 | Alice Martin, Charles Ollion, Florian Strub, Sylvain Le Corff, Olivier Pietquin

This paper introduces the Sequential Monte Carlo Transformer, an original approach that naturally captures the observations distribution in a recurrent architecture. The keys, queries, values and attention vectors of the network are considered as the unobserved stochastic states of its hidden structure...
