Deep $k$-NN Label Smoothing Improves Reproducibility of Neural Network Predictions

1 Jan 2021  ·  Dara Bahri, Heinrich Jiang

Training modern neural networks is an inherently noisy process that can lead to high \emph{prediction churn}: disagreements between re-trainings of the same model, caused by factors such as randomness in parameter initialization and mini-batch ordering, even when every trained model attains high accuracy. Such prediction churn can be highly undesirable in practice. In this paper, we present several baselines for reducing churn and show that using $k$-NN predictions to smooth the training labels yields a new and principled method that outperforms these baselines on churn while also improving accuracy across a variety of benchmark classification tasks and model architectures.
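The abstract describes smoothing training labels with $k$-NN predictions. Below is a minimal sketch of that idea, assuming the soft target is a convex combination of the one-hot label and the empirical class distribution over each example's $k$ nearest neighbors in some feature space. The function name `knn_smoothed_labels`, the choice of embedding, and the mixing weight `alpha` are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch: k-NN label smoothing (illustrative, not the paper's exact method).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_smoothed_labels(embeddings, labels, num_classes, k=10, alpha=0.5):
    """Return soft targets mixing one-hot labels with k-NN label frequencies.

    embeddings : (n, d) feature vectors (e.g. penultimate-layer activations)
    labels     : (n,) integer class labels
    alpha      : weight on the original one-hot label (1.0 recovers hard labels)
    """
    one_hot = np.eye(num_classes)[labels]                       # (n, c)

    # Find each point's k nearest neighbors, excluding the point itself.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(embeddings)
    _, idx = nn.kneighbors(embeddings)
    neighbor_labels = labels[idx[:, 1:]]                        # (n, k)

    # Empirical class distribution over each point's neighborhood.
    knn_dist = np.stack(
        [np.bincount(row, minlength=num_classes) / k for row in neighbor_labels]
    )                                                           # (n, c)

    # Convex combination of the hard label and the k-NN prediction.
    return alpha * one_hot + (1.0 - alpha) * knn_dist
```

In this sketch, training would proceed by minimizing cross-entropy against these soft targets instead of the original one-hot labels, analogously to standard label smoothing but with a data-dependent smoothing distribution.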
