STAIR Captions is a large-scale dataset containing 820,310 Japanese captions. This dataset can be used for caption generation, multimodal retrieval, and image generation.
4 PAPERS • NO BENCHMARKS YET