no code implementations • 8 Feb 2024 • Jonathan Thomm, Aleksandar Terzic, Geethan Karunaratne, Giacomo Camposampiero, Bernhard Schölkopf, Abbas Rahimi
We analyze the capabilities of Transformer language models in learning discrete algorithms.
no code implementations • 9 Dec 2023 • Aleksandar Terzic, Michael Hersche, Geethan Karunaratne, Luca Benini, Abu Sebastian, Abbas Rahimi
We build upon their approach by replacing the linear recurrence with a special temporal convolutional network that permits a larger receptive field with shallower networks and reduces the computational complexity to $O(L)$.
no code implementations • 24 Mar 2023 • Michael Hersche, Aleksandar Terzic, Geethan Karunaratne, Jovin Langenegger, Angéline Pouget, Giovanni Cherubini, Luca Benini, Abu Sebastian, Abbas Rahimi
We provide a methodology to flexibly integrate our factorizer in the classification layer of CNNs with a novel loss function.