A Call to Reflect on Evaluation Practices for Age Estimation: Comparative Analysis of the State-of-the-Art and a Unified Benchmark
Comparing different age estimation methods poses a challenge due to the unreliability of published results stemming from inconsistencies in the benchmarking process. Previous studies have reported continuous performance improvements over the past decade using specialized methods; however, our findings challenge these claims. This paper identifies two trivial, yet persistent issues with the currently used evaluation protocol and describes how to resolve them. We offer an extensive comparative analysis for state-of-the-art facial age estimation methods. Surprisingly, we find that the performance differences between the methods are negligible compared to the effect of other factors, such as facial alignment, facial coverage, image resolution, model architecture, or the amount of data used for pretraining. We use the gained insights to propose using FaRL as the backbone model and demonstrate its effectiveness on all public datasets. We make the source code and exact data splits public on GitHub.
Results from the Paper
Ranked #1 on Age Estimation on ChaLearn 2016 (MAE metric, using extra training data)
Task | Dataset | Model | Metric Name | Metric Value | Global Rank | Uses Extra Training Data | Benchmark |
---|---|---|---|---|---|---|---|
Age Estimation | AFAD | ResNet-50-SORD | MAE | 3.14 | # 7 | ||
Age Estimation | AFAD | ResNet-50-Mean-Variance | MAE | 3.16 | # 4 | ||
Age Estimation | AFAD | ResNet-50-Unimodal-Concentrated | MAE | 3.20 | # 2 | ||
Age Estimation | AFAD | ResNet-50-Regression | MAE | 3.17 | # 3 | ||
Age Estimation | AFAD | FaRL+MLP | MAE | 3.12 | # 10 | ||
Age Estimation | AFAD | ResNet-50-Cross-Entropy | MAE | 3.14 | # 7 | ||
Age Estimation | AFAD | ResNet-50-OR-CNN | MAE | 3.16 | # 4 | ||
Age Estimation | AFAD | ResNet-50-DLDL | MAE | 3.14 | # 7 | ||
Age Estimation | AFAD | ResNet-50-DLDL-v2 | MAE | 3.15 | # 6 | ||
Age Estimation | AgeDB | ResNet-50-SORD | MAE | 5.81 | # 6 | ||
Age Estimation | AgeDB | ResNet-50-Cross-Entropy | MAE | 5.81 | # 6 | ||
Age Estimation | AgeDB | ResNet-50-DLDL-v2 | MAE | 5.80 | # 4 | ||
Age Estimation | AgeDB | ResNet-50-DLDL | MAE | 5.80 | # 4 | ||
Age Estimation | AgeDB | FaRL+MLP | MAE | 5.64 | # 2 | ||
Age Estimation | AgeDB | ResNet-50-OR-CNN | MAE | 5.78 | # 3 | ||
Age Estimation | AgeDB | ResNet-50-Regression | MAE | 6.23 | # 10 | ||
Age Estimation | AgeDB | ResNet-50-Unimodal-Concentrated | MAE | 5.90 | # 9 | ||
Age Estimation | AgeDB | ResNet-50-Mean-Variance | MAE | 5.85 | # 8 | ||
Age Estimation | CACD | ResNet-50-Regression | MAE | 4.06 | # 8 | ||
Age Estimation | CACD | FaRL+MLP | MAE | 3.96 | # 2 | ||
Age Estimation | CACD | ResNet-50-SORD | MAE | 3.96 | # 2 | ||
Age Estimation | CACD | ResNet-50-Unimodal-Concentrated | MAE | 4.10 | # 10 | ||
Age Estimation | CACD | ResNet-50-Mean-Variance | MAE | 4.07 | # 9 | ||
Age Estimation | CACD | ResNet-50-DLDL-v2 | MAE | 3.96 | # 2 | ||
Age Estimation | CACD | ResNet-50-DLDL | MAE | 3.96 | # 2 | ||
Age Estimation | CACD | ResNet-50-OR-CNN | MAE | 4.01 | # 7 | ||
Age Estimation | CACD | ResNet-50-Cross-Entropy | MAE | 3.96 | # 2 | ||
Age Estimation | ChaLearn 2016 | FaRL+MLP | MAE | 3.38 | # 1 | ||
Age Estimation | MORPH Album2 (SE) | ResNet-50-Cross-Entropy | MAE | 2.81 | # 2 | ||
Age Estimation | MORPH Album2 (SE) | ResNet-50-OR-CNN | MAE | 2.83 | # 6 | ||
Age Estimation | MORPH Album2 (SE) | FaRL+MLP | MAE | 3.04 | # 9 | ||
Age Estimation | MORPH Album2 (SE) | ResNet-50-Regression | MAE | 2.83 | # 6 | ||
Age Estimation | MORPH Album2 (SE) | ResNet-50-Unimodal-Concentrated | MAE | 2.78 | # 1 | ||
Age Estimation | MORPH Album2 (SE) | ResNet-50-Mean-Variance | MAE | 2.83 | # 6 | ||
Age Estimation | MORPH Album2 (SE) | ResNet-50-SORD | MAE | 2.81 | # 2 | ||
Age Estimation | MORPH Album2 (SE) | ResNet-50-DLDL-v2 | MAE | 2.82 | # 5 | ||
Age Estimation | MORPH Album2 (SE) | ResNet-50-DLDL | MAE | 2.81 | # 2 | ||
Age Estimation | UTKFace | ResNet-50-Cross-Entropy | MAE | 4.38 | # 6 | ||
Age Estimation | UTKFace | ResNet-50-Regression | MAE | 4.72 | # 13 | ||
Age Estimation | UTKFace | ResNet-50-OR-CNN | MAE | 4.40 | # 8 | ||
Age Estimation | UTKFace | ResNet-50-DLDL | MAE | 4.39 | # 7 | ||
Age Estimation | UTKFace | ResNet-50-DLDL-v2 | MAE | 4.42 | # 9 | ||
Age Estimation | UTKFace | ResNet-50-SORD | MAE | 4.36 | # 4 | ||
Age Estimation | UTKFace | ResNet-50-Mean-Variance | MAE | 4.42 | # 9 | ||
Age Estimation | UTKFace | ResNet-50-Unimodal-Concentrated | MAE | 4.47 | # 11 | ||
Age Estimation | UTKFace | FaRL+MLP | MAE | 3.87 | # 2 | ||
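
Every entry in the table reports MAE, the mean absolute error between predicted and true ages, in years; lower is better. The following is a minimal sketch of how such a value can be computed, not the authors' evaluation code. The function name `mean_absolute_error` and the example ages are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def mean_absolute_error(true_ages: Sequence[float],
                        predicted_ages: Sequence[float]) -> float:
    """Return the average of |true - predicted| over all samples (in years)."""
    if len(true_ages) != len(predicted_ages):
        raise ValueError("true_ages and predicted_ages must have the same length")
    if len(true_ages) == 0:
        raise ValueError("at least one sample is required")
    total_error = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total_error / len(true_ages)


# Hypothetical ages, not taken from any dataset in the table.
example_true = [20, 30, 40]
example_pred = [22, 27, 40]
example_mae = mean_absolute_error(example_true, example_pred)
```

Because the metric averages per-image errors over a test set, the value depends on which images end up in that set, which is why the exact data splits matter when comparing rows.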