Search Results for author: Daniele Coppola

Found 1 papers, 0 papers with code

Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers

no code implementations • 17 Nov 2023 • Vukasin Bozic, Danilo Dordevic, Daniele Coppola, Joseph Thommes, Sidak Pal Singh

This work presents an analysis of the effectiveness of using standard shallow feed-forward networks to mimic the behavior of the attention mechanism in the original Transformer model, a state-of-the-art architecture for sequence-to-sequence tasks.

Knowledge Distillation

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.