no code implementations • 5 Apr 2024 • Andy Yang, David Chiang
Deriving formal bounds on the expressivity of transformers and studying transformers constructed to implement known algorithms are both effective methods for better understanding the computational power of transformers.
no code implementations • 21 Oct 2023 • Dana Angluin, David Chiang, Andy Yang
We consider transformer encoders with hard attention (in which all attention is focused on exactly one position) and strict future masking (in which each position only attends to positions strictly to its left), and prove that the class of languages recognized by these networks is exactly the star-free languages.
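The hard-attention and strict-future-masking setup can be sketched minimally: each position attends to exactly one earlier position, namely the argmax of its masked scores. The function name and the `-1` convention for position 0 (which has no positions strictly to its left) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def hard_attention_strict_mask(scores):
    """Unique hard attention with strict future masking.

    scores: (n, n) matrix of attention scores, where scores[i, j] is
    how strongly position i scores position j. Strict masking means
    position i may only look at positions j < i; hard attention means
    all weight goes to the single highest-scoring visible position.

    Returns an array `attended` where attended[i] is the one position
    that i attends to, or -1 for position 0 (illustrative convention,
    since nothing lies strictly to its left).
    """
    n = scores.shape[0]
    attended = np.full(n, -1, dtype=int)
    for i in range(1, n):
        # Strict future masking: only scores[i, :i] are visible.
        # Hard attention: pick exactly one position, the argmax.
        attended[i] = int(np.argmax(scores[i, :i]))
    return attended

# Small usage example with a 3-position sequence.
scores = np.array([[0.0, 0.0, 0.0],
                   [5.0, 1.0, 0.0],
                   [1.0, 9.0, 2.0]])
print(hard_attention_strict_mask(scores))  # [-1  0  1]
```

Because all attention mass collapses onto one position, the computation is discrete, which is what makes an exact characterization (here, the star-free languages) tractable.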