2 code implementations • CVPR 2022 • Yong Liu, Siqi Mai, Xiangning Chen, Cho-Jui Hsieh, Yang You
Recently, Sharpness-Aware Minimization (SAM), which connects the geometry of the loss landscape and generalization, has demonstrated significant performance boosts on training large-scale models such as vision transformers.
no code implementations • 29 Sep 2021 • Yong Liu, Siqi Mai, Xiangning Chen, Cho-Jui Hsieh, Yang You
Large-batch training is an important direction for distributed machine learning: it can improve the utilization of large-scale clusters and therefore accelerate the training process.