no code implementations • 2 Apr 2024 • Lena Strobl, Dana Angluin, David Chiang, Jonathan Rawski, Ashish Sabharwal
We study the sequence-to-sequence mapping capacity of transformers by relating them to finite transducers, and find that they can express surprisingly large classes of transductions.
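To orient readers, a finite transducer computes a transduction: a mapping from input sequences to output sequences driven by a finite set of states. The following is a minimal sketch of my own (not taken from the paper); the states, alphabet, and the particular mapping are hypothetical and chosen only to illustrate the notion.

    # Minimal sketch (not from the paper): a deterministic finite-state transducer
    # computing a toy length-preserving transduction over the alphabet {a, b}.
    def run_transducer(word):
        """Flip every symbol that immediately follows an 'a'; copy all others."""
        state = "plain"          # becomes "after_a" right after reading an 'a'
        output = []
        for symbol in word:
            if state == "after_a":
                output.append("b" if symbol == "a" else "a")   # flip the symbol
            else:
                output.append(symbol)                          # copy it unchanged
            state = "after_a" if symbol == "a" else "plain"
        return "".join(output)

    print(run_transducer("abab"))   # -> "aaaa"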
no code implementations • 1 Nov 2023 • Lena Strobl, William Merrill, Gail Weiss, David Chiang, Dana Angluin
As transformers have gained prominence in natural language processing, some researchers have investigated theoretically which problems they can and cannot solve by treating those problems as formal languages.
no code implementations • 21 Oct 2023 • Dana Angluin, David Chiang, Andy Yang
We consider transformer encoders with hard attention (in which all attention is focused on exactly one position) and strict future masking (in which each position only attends to positions strictly to its left), and prove that the class of languages recognized by these networks is exactly the star-free languages.
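The two mechanisms named in the abstract can be made concrete with a small NumPy sketch of my own (not the paper's construction): under strict future masking, each position may only look at strictly earlier positions, and under hard attention it copies the value of exactly one of them, here the argmax of its scores. The leftmost tie-break and the zero vector at position 0 are my own placeholder choices.

    import numpy as np

    # Minimal sketch (illustrative only): one layer of hard attention with strict
    # future masking. Each position attends to exactly one strictly earlier
    # position -- the argmax of its attention scores -- and copies that position's
    # value vector. Position 0 has nothing to its left, so it keeps a zero vector.
    def strict_masked_hard_attention(scores, values):
        n, d = values.shape
        out = np.zeros((n, d))
        for i in range(1, n):                     # position 0 has no earlier position
            j = int(np.argmax(scores[i, :i]))     # unique earlier position (ties -> leftmost)
            out[i] = values[j]
        return out

    rng = np.random.default_rng(0)
    n, d = 5, 3
    print(strict_masked_hard_attention(rng.normal(size=(n, n)), rng.normal(size=(n, d))))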
no code implementations • 13 Apr 2022 • Yiding Hao, Dana Angluin, Robert Frank
This paper analyzes three formal models of Transformer encoders that differ in the form of their self-attention mechanism: unique hard attention (UHAT); generalized unique hard attention (GUHAT), which generalizes UHAT; and averaging hard attention (AHAT).
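The contrast between unique and averaging hard attention can be sketched in a few lines of NumPy (my illustration, not the paper's formalization): UHAT selects a single maximizing position per query, while AHAT averages the values of all positions that tie for the maximum score. The leftmost tie-break in the UHAT sketch is a hypothetical choice.

    import numpy as np

    def uhat(scores, values):
        # one value vector per query: leftmost argmax (a hypothetical tie-break)
        return values[np.argmax(scores, axis=-1)]

    def ahat(scores, values):
        out = np.zeros((scores.shape[0], values.shape[1]))
        for i, row in enumerate(scores):
            winners = np.flatnonzero(row == row.max())   # all maximizing positions
            out[i] = values[winners].mean(axis=0)        # uniform average over ties
        return out

    scores = np.array([[1.0, 2.0, 2.0],
                       [0.0, 3.0, 1.0]])
    values = np.eye(3)
    print(uhat(scores, values))   # each query copies exactly one value vector
    print(ahat(scores, values))   # query 0 averages the two tied positions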
no code implementations • 10 Sep 2018 • Dana Angluin, Dana Fisman
The right congruence of a regular omega-language is not informative enough; many regular omega-languages have a trivial right congruence, and in general it is not always possible to define an omega-automaton recognizing a given language that is isomorphic to the automaton induced by the right congruence (the rightcon automaton).
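For orientation, the relation in question is the standard Myhill-Nerode-style right congruence for omega-languages; stated here in my own words, not quoted from the paper: for $L \subseteq \Sigma^\omega$ and finite words $u, v \in \Sigma^*$,

    u \sim_L v \;\iff\; \forall w \in \Sigma^\omega:\ uw \in L \Leftrightarrow vw \in L.

A standard example of the triviality mentioned above: for $L$ = "all omega-words over $\{a,b\}$ containing finitely many $a$'s", every pair of finite words is right-congruent, since membership depends only on the infinite suffix, yet recognizing $L$ still requires a nontrivial omega-automaton.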
2 code implementations • WS 2018 • Yiding Hao, William Merrill, Dana Angluin, Robert Frank, Noah Amsel, Andrew Benz, Simon Mendelsohn
This paper analyzes the behavior of stack-augmented recurrent neural network (RNN) models.
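As a rough picture of the architecture being analyzed, here is a minimal sketch of my own of a stack-augmented RNN step; the paper studies differentiable neural stacks, whereas this simplification uses a hard, discrete stack, and all weights, dimensions, and the push/pop rule are hypothetical placeholders.

    import numpy as np

    # Minimal sketch (my simplification, not the paper's model): an RNN controller
    # that, at each step, reads the input and the stack top, updates its hidden
    # state, and then either pushes a new scalar onto the stack or pops it.
    class StackRNN:
        def __init__(self, input_dim, hidden_dim, seed=0):
            rng = np.random.default_rng(seed)
            self.W = rng.normal(scale=0.1, size=(hidden_dim, input_dim + hidden_dim + 1))
            self.stack = []

        def step(self, x, h):
            top = self.stack[-1] if self.stack else 0.0
            h = np.tanh(self.W @ np.concatenate([x, h, [top]]))   # controller update
            if h[0] > 0:
                self.stack.append(float(h[1]))                    # push a scalar
            elif self.stack:
                self.stack.pop()                                  # pop
            return h

    rnn = StackRNN(input_dim=2, hidden_dim=4)
    h = np.zeros(4)
    for x in np.eye(2)[[0, 1, 0]]:       # a toy 3-step input sequence
        h = rnn.step(x, h)
    print(h, rnn.stack)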