The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization

29 Jun 2020 · Dan Hendrycks, Steven Basart, Norman Mu, Saurav Kadavath, Frank Wang, Evan Dorundo, Rahul Desai, Tyler Zhu, Samyak Parajuli, Mike Guo, Dawn Song, Jacob Steinhardt, Justin Gilmer

We introduce three new robustness benchmarks consisting of naturally occurring distribution changes in image style, geographic location, camera operation, and more. Using our benchmarks, we take stock of previously proposed hypotheses for out-of-distribution robustness and put them to the test...


Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Domain Generalization | ImageNet-C | DeepAugment (ResNet-50) | mean Corruption Error (mCE) | 60.4 | # 1 |
| Domain Generalization | ImageNet-R | DeepAugment+AugMix (ResNet-50) | Top-1 Error Rate | 53.2 | # 1 |
| Domain Generalization | ImageNet-R | DeepAugment (ResNet-50) | Top-1 Error Rate | 57.8 | # 2 |
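
For readers unfamiliar with the two metrics in the results above, the following is a minimal sketch of how they are commonly computed. It assumes the usual ImageNet-C convention, in which each corruption's error (summed over severity levels) is divided by a baseline model's error on the same corruption before averaging; that convention comes from the ImageNet-C benchmark definition, not from this page. The function names and the toy numbers are hypothetical and are not results from the paper.

```python
from statistics import mean


def top1_error_rate(predictions, labels):
    """Percentage of examples whose top-1 prediction differs from the label."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    wrong = sum(p != y for p, y in zip(predictions, labels))
    return 100.0 * wrong / len(labels)


def mean_corruption_error(model_errors, baseline_errors):
    """Mean Corruption Error (mCE), in percent.

    Both arguments map a corruption name to a list of error rates,
    one entry per severity level. The baseline is assumed to be a fixed
    reference model evaluated on the same corruptions and severities.
    """
    corruption_errors = []
    for corruption, errors in model_errors.items():
        baseline = baseline_errors[corruption]
        # Normalize the summed error by the baseline's summed error.
        corruption_errors.append(sum(errors) / sum(baseline))
    return 100.0 * mean(corruption_errors)


# Toy illustration with made-up numbers (not taken from the paper).
toy_model = {"noise": [10.0, 20.0], "blur": [30.0, 30.0]}
toy_baseline = {"noise": [20.0, 40.0], "blur": [60.0, 60.0]}
toy_mce = mean_corruption_error(toy_model, toy_baseline)
toy_top1 = top1_error_rate([1, 2, 3, 4], [1, 2, 0, 0])
```

Under this convention an mCE of 100 means the model is, on average, as error-prone as the baseline, and lower values are better. The Top-1 Error Rate is simply 100 minus top-1 accuracy.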
