no code implementations • 23 Jun 2023 • Pascal Jr. Tikeng Notsawo, Hattie Zhou, Mohammad Pezeshki, Irina Rish, Guillaume Dumas
In essence, by studying the learning curve of the first few epochs, we show that one can predict whether grokking will occur later on.