3 dataset results for Semi Supervised Learning for Image Captioning AND Images

MS COCO (Microsoft Common Objects in Context)

The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.

10,363 PAPERS • 93 BENCHMARKS

Flickr30k

The Flickr30k dataset contains 31,000 images collected from Flickr, each paired with 5 reference sentences provided by human annotators.

754 PAPERS • 9 BENCHMARKS

FlickrStyle10K

FlickrStyle10K is built on the Flickr30K image caption dataset. The original FlickrStyle10K dataset has 10,000 pairs of images and stylized captions, covering humorous and romantic styles. However, only the 7,000 pairs from the official training set are now publicly accessible. The dataset can be downloaded from https://zhegan27.github.io/Papers/FlickrStyle_v0.9.zip

23 PAPERS • 2 BENCHMARKS
