Search Results for author: Sadegh Mahdavi

Found 2 papers, 2 papers with code

Memorization Capacity of Multi-Head Attention in Transformers

1 code implementation • 3 Jun 2023 • Sadegh Mahdavi, Renjie Liao, Christos Thrampoulidis

Transformers have become the go-to architecture for language and vision tasks, yet their theoretical properties, especially memorization capacity, remain elusive.

Image Classification, Memorization, +1

Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks

1 code implementation • 1 Nov 2022 • Sadegh Mahdavi, Kevin Swersky, Thomas Kipf, Milad Hashemi, Christos Thrampoulidis, Renjie Liao

In this paper, we study the OOD generalization of neural algorithmic reasoning tasks, where the goal is to learn an algorithm (e.g., sorting, breadth-first search, and depth-first search) from input-output pairs using deep neural networks.

Data Augmentation, Out-of-Distribution Generalization
