ImageNet-Hard

ImageNet-Hard is a new benchmark that comprises 10,980 images collected from various existing ImageNet-scale benchmarks (ImageNet, ImageNet-V2, ImageNet-Sketch, ImageNet-C, ImageNet-R, ImageNet-ReaL, ImageNet-A, and ObjectNet). This dataset poses a significant challenge to state-of-the-art vision models as merely zooming in often fails to improve their ability to classify images correctly. As a result, even the most advanced models, such as CLIP-ViT-L/14@336px, struggle to perform well on this dataset, achieving a mere 2.02% accuracy.

Papers


Paper Code Results Date Stars

Dataset Loaders


No data loaders found. You can submit your data loader here.

Tasks


Similar Datasets


License


  • mit

Modalities


Languages