no code implementations • 6 Nov 2023 • Wonho Bae, Yi Ren, Mohamad Osama Ahmed, Frederick Tung, Danica J. Sutherland, Gabriel L. Oliveira
Although neural networks are conventionally optimized towards zero training loss, it has been recently learned that targeting a non-zero training loss threshold, referred to as a flood level, often enables better test time generalization.