Evaluation of Output Embeddings for Fine-Grained Image Classification

Image classification has advanced significantly in recent years with the availability of large-scale image sets. However, fine-grained classification remains a major challenge due to the annotation cost of large numbers of fine-grained categories. This project shows that compelling classification performance can be achieved on such categories even without labeled training data. Given image and class embeddings, we learn a compatibility function such that matching embeddings are assigned a higher score than mismatching ones; zero-shot classification of an image proceeds by finding the label yielding the highest joint compatibility score. We use state-of-the-art image features and focus on different supervised attributes and unsupervised output embeddings either derived from hierarchies or learned from unlabeled text corpora. We establish a substantially improved state-of-the-art on the Animals with Attributes and Caltech-UCSD Birds datasets. Most encouragingly, we demonstrate that purely unsupervised output embeddings (learned from Wikipedia and improved with fine-grained text) achieve compelling results, even outperforming the previous supervised state-of-the-art. By combining different output embeddings, we further improve results.
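
The zero-shot prediction rule described in the abstract can be illustrated with a minimal sketch. The abstract only speaks of a "compatibility function", so the bilinear form x^T W y used here is an assumption made for illustration, as are the function names, the toy embeddings, and the hand-picked matrix W (in practice W would be learned so that matching pairs outscore mismatching ones).

```python
from typing import Dict, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def compatibility(image_emb: Vector, W: Matrix, class_emb: Vector) -> float:
    """Score an (image, class) pair as x^T W y.

    The bilinear form is an assumption for this sketch; W has one row per
    image-embedding dimension and one column per class-embedding dimension.
    """
    return sum(
        image_emb[i] * W[i][j] * class_emb[j]
        for i in range(len(image_emb))
        for j in range(len(class_emb))
    )


def zero_shot_predict(image_emb: Vector, W: Matrix,
                      class_embs: Dict[str, Vector]) -> str:
    """Return the label whose class embedding yields the highest score."""
    return max(
        class_embs,
        key=lambda label: compatibility(image_emb, W, class_embs[label]),
    )


# Toy, hand-written values (hypothetical): 2-D image features and
# 3-D class embeddings such as attribute vectors.
W = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
class_embs = {
    "zebra": [1.0, 0.0, 1.0],
    "whale": [0.0, 1.0, 0.0],
}

print(zero_shot_predict([1.0, 0.0], W, class_embs))
```

Because a class enters the prediction only through its embedding, a class with no labeled training images can still be scored, provided an embedding (attributes, hierarchy, or text-derived) is available for it.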

CVPR 2015
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Few-Shot Image Classification | CUB-200 - 0-Shot Learning | SJE | Accuracy | 50.1% | # 2 |
| Few-Shot Image Classification | CUB-200-2011 - 0-Shot | SJE | Top-1 Accuracy | 50.1% | # 3 |
| Few-Shot Image Classification | CUB 200 50-way (0-shot) | SJE, Akata et al. (2015) | Accuracy | 50.1 | # 4 |
| Zero-Shot Action Recognition | HMDB51 | SJE (Word Embedding) | Top-1 Accuracy | 13.3 | # 26 |
| Zero-Shot Action Recognition | Kinetics | SJE (Word Embedding) | Top-1 Accuracy | 22.3 | # 17 |
| Zero-Shot Action Recognition | Kinetics | SJE (Word Embedding) | Top-5 Accuracy | 48.2 | # 15 |
| Zero-Shot Action Recognition | Olympics | SJE (Attribute) | Top-1 Accuracy | 47.5 | # 6 |
| Zero-Shot Action Recognition | Olympics | SJE (Word Embedding) | Top-1 Accuracy | 28.6 | # 9 |
| Zero-Shot Action Recognition | UCF101 | SJE (Attribute) | Top-1 Accuracy | 12.0 | # 30 |
| Zero-Shot Action Recognition | UCF101 | SJE (Word Embedding) | Top-1 Accuracy | 9.9 | # 32 |
