Tune It or Don't Use It: Benchmarking Data-Efficient Image Classification

30 Aug 2021 · Lorenzo Brigato, Björn Barz, Luca Iocchi, Joachim Denzler

Data-efficient image classification using deep neural networks in settings where only small amounts of labeled data are available has been an active research area in the recent past. However, an objective comparison between published methods is difficult, since existing works use different datasets for evaluation and often compare against untuned baselines with default hyper-parameters. We design a benchmark for data-efficient image classification consisting of six diverse datasets spanning various domains (e.g., natural images, medical imagery, satellite data) and data types (RGB, grayscale, multispectral). Using this benchmark, we re-evaluate the standard cross-entropy baseline and eight methods for data-efficient deep learning published between 2017 and 2021 at renowned venues. For a fair and realistic comparison, we carefully tune the hyper-parameters of all methods on each dataset. Surprisingly, we find that tuning learning rate, weight decay, and batch size on a separate validation split results in a highly competitive baseline, which outperforms all but one of the specialized methods and is competitive with the remaining one.
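To make the tuning protocol from the abstract concrete, here is a minimal, hypothetical PyTorch sketch of grid-searching the three hyper-parameters named above (learning rate, weight decay, batch size) and selecting by accuracy on a held-out validation split. The model factory `model_fn`, the SGD optimizer, the epoch budget, and the grid values are illustrative assumptions, not the authors' exact search space.

```python
# Sketch: tune lr, weight decay, and batch size of a plain
# cross-entropy baseline on a separate validation split.
import itertools
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

def train_and_validate(model_fn, train_set, val_set, lr, wd, bs, epochs=50):
    model = model_fn()  # hypothetical factory returning a fresh network
    opt = torch.optim.SGD(model.parameters(), lr=lr,
                          momentum=0.9, weight_decay=wd)
    loss_fn = nn.CrossEntropyLoss()
    loader = DataLoader(train_set, batch_size=bs, shuffle=True)
    for _ in range(epochs):
        model.train()
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    # Accuracy on the validation split drives model selection.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in DataLoader(val_set, batch_size=256):
            correct += (model(x).argmax(dim=1) == y).sum().item()
            total += y.numel()
    return correct / total

def tune(model_fn, train_set, val_set):
    # Placeholder grids; the paper's actual search ranges may differ.
    grid = itertools.product(
        [1e-1, 1e-2, 1e-3],   # learning rates
        [1e-3, 1e-4, 1e-5],   # weight decays
        [8, 16, 32],          # batch sizes
    )
    best = max(grid, key=lambda cfg:
               train_and_validate(model_fn, train_set, val_set, *cfg))
    return best  # (lr, wd, bs) with the highest validation accuracy
```

The point of the sketch is how little machinery the winning baseline needs: no specialized loss or architecture, just a disciplined search over three standard knobs.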

Results (Task: Small Data Image Classification)

| Dataset | Model | Metric | Value | Global Rank |
|---------|-------|--------|-------|-------------|
| ciFAIR-10, 50 samples per class | Cross-Entropy Baseline | Accuracy | 58.22 | #3 |
| ciFAIR-10, 50 samples per class | Harmonic Networks | Accuracy | 56.50 | #5 |
| ciFAIR-10, 50 samples per class | T-vMF Similarity | Accuracy | 57.50 | #4 |
| CUB-200-2011, 30 samples per class | DSK Networks (no pre-training) | Accuracy | 71.02 | #4 |
| CUB-200-2011, 30 samples per class | Cross-Entropy Baseline (no pre-training) | Accuracy | 71.44 | #3 |
| CUB-200-2011, 30 samples per class | Harmonic Networks (no pre-training) | Accuracy | 72.26 | #2 |
| DEIC Benchmark | Grad-l2 Penalty | Average Balanced Accuracy (across datasets) | 55.47 | #10 |
| DEIC Benchmark | Cosine + Cross-Entropy Loss | Average Balanced Accuracy (across datasets) | 64.92 | #3 |
| DEIC Benchmark | Harmonic Networks | Average Balanced Accuracy (across datasets) | 68.70 | #1 |
| DEIC Benchmark | Full Convolution | Average Balanced Accuracy (across datasets) | 62.06 | #8 |
| DEIC Benchmark | Cosine Loss | Average Balanced Accuracy (across datasets) | 62.73 | #7 |
| DEIC Benchmark | OLÉ | Average Balanced Accuracy (across datasets) | 64.15 | #6 |
| DEIC Benchmark | DSK Networks | Average Balanced Accuracy (across datasets) | 64.64 | #5 |
| DEIC Benchmark | T-vMF Similarity | Average Balanced Accuracy (across datasets) | 64.67 | #4 |
| DEIC Benchmark | Cross-Entropy Baseline | Average Balanced Accuracy (across datasets) | 67.90 | #2 |
| DEIC Benchmark | Deep Hybrid Networks | Average Balanced Accuracy (across datasets) | 60.33 | #9 |
| EuroSAT, 50 samples per class | Harmonic Networks | Accuracy | 92.09 | #1 |
| EuroSAT, 50 samples per class | DSK Networks | Accuracy | 91.25 | #2 |
| EuroSAT, 50 samples per class | Deep Hybrid Networks | Accuracy | 91.15 | #3 |
| ImageNet, 50 samples per class | Cross-Entropy Baseline | 1:1 Accuracy | 44.97 | #3 |
| ImageNet, 50 samples per class | Harmonic Networks | 1:1 Accuracy | 46.36 | #1 |
| ImageNet, 50 samples per class | DSK Networks | 1:1 Accuracy | 45.21 | #2 |
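The DEIC Benchmark rows report average balanced accuracy across the six datasets. Balanced accuracy is the unweighted mean of per-class recalls, which keeps majority classes from dominating the score; the benchmark score then averages it over datasets. A minimal NumPy sketch (the function names here are illustrative; the behavior matches `sklearn.metrics.balanced_accuracy_score`):

```python
import numpy as np

def balanced_accuracy(y_true, y_pred):
    # Unweighted mean of per-class recalls.
    classes = np.unique(y_true)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(recalls))

def benchmark_score(per_dataset_preds):
    """per_dataset_preds: list of (y_true, y_pred) pairs, one per dataset."""
    return float(np.mean([balanced_accuracy(t, p)
                          for t, p in per_dataset_preds]))
```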
