We propose a split built on top of Stanford GQA dataset originally proposed for VQA and name it Compositional GQA (C-GQA) dataset (see supplementary for the details). CGQA contains over 9.5k compositional labels making it the most extensive dataset for CZSL. With cleaner labels and a larger label space, our hope is that this dataset will inspire further research on the topic.
27 PAPERS • NO BENCHMARKS YET
ReaSCAN is a synthetic navigation task that requires models to reason about surroundings over syntactically difficult languages.
7 PAPERS • NO BENCHMARKS YET
UT Zappos50K (UT-Zap50K) is a large shoe dataset consisting of 50,025 catalog images collected from Zappos.com. The images are divided into 4 major categories — shoes, sandals, slippers, and boots — followed by functional types and individual brands. The shoes are centered on a white background and pictured in the same orientation for convenient analysis. This dataset is created in the context of an online shopping task, where users pay special attentions to fine-grained visual differences. For instance, it is more likely that a shopper is deciding between two pairs of similar men's running shoes instead of between a woman's high heel and a man's slipper. GIST and LAB color features are provided. In addition, each image has 8 associated meta-data (gender, materials, etc.) labels that are used to filter the shoes on Zappos.com. We introduced this dataset in the context of a pairwise comparison task, where the goal is to predict which of two images more strongly exhibits a visual attribu
0 PAPER • NO BENCHMARKS YET