Fine-grained Image Classification and Retrieval by Combining Visual and Locally Pooled Textual Features

14 Jan 2020 · Andres Mafla, Sounak Dey, Ali Furkan Biten, Lluis Gomez, Dimosthenis Karatzas

Text contained in an image carries high-level semantics that can be exploited to achieve richer image understanding. In particular, the mere presence of text provides strong guiding content that should be employed to tackle a diversity of computer vision tasks such as image retrieval, fine-grained classification, and visual question answering...

| TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK |
|---|---|---|---|---|---|
| Fine-Grained Image Classification | Bottles | PHOC descriptor + Fisher Vector Encoding | mAP | 77.4 | # 1 |
| Fine-Grained Image Classification | Con-Text | PHOC descriptor + Fisher Vector Encoding | mAP | 80.2 | # 1 |
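
The results above are reported as mAP (mean average precision). As a rough illustration of what that metric measures, the sketch below computes non-interpolated average precision per class and averages it over classes. The function names and the toy scores and labels are hypothetical, and the paper's exact evaluation protocol is not stated on this page, so this should be read as one common definition rather than the authors' evaluation code.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated AP for one class.

    scores: classifier confidence for each image (higher = more likely positive).
    labels: 1 if the image belongs to the class, 0 otherwise.
    """
    # Rank images from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            # Precision measured at the rank of each positive image.
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(
    per_class_scores: Sequence[Sequence[float]],
    per_class_labels: Sequence[Sequence[int]],
) -> float:
    """Mean of the per-class AP values."""
    aps = [
        average_precision(s, l)
        for s, l in zip(per_class_scores, per_class_labels)
    ]
    return sum(aps) / len(aps)


# Toy data (made up for illustration): 2 classes, 4 images each.
toy_scores = [[0.9, 0.8, 0.7, 0.6], [0.2, 0.9, 0.4, 0.1]]
toy_labels = [[1, 0, 1, 0], [0, 1, 1, 0]]
toy_map = mean_average_precision(toy_scores, toy_labels)
```

Leaderboards typically show this value as a percentage, so a figure such as 80.2 would correspond to a mean AP of 0.802 under this definition.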
