Search Results for author: Aman Shukla

Found 1 paper, 1 paper with code

A Tale of Two Circuits: Grokking as Competition of Sparse and Dense Subnetworks

1 code implementation · 21 Mar 2023 · William Merrill, Nikolaos Tsilivis, Aman Shukla

Grokking is a phenomenon where a model trained on an algorithmic task first overfits but then, after a large amount of additional training, undergoes a phase transition to generalize perfectly.
