19 Apr 2024 • Longfei Huang, Shupeng Zhong, Xiangyu Wu, Ruoxuan Li
Subsequently, we propose a caption-level strategy for the high-quality caption data generated by image captioning models and integrate these captions, together with a retrieval augmentation strategy, into the template, compelling the model to generate higher-quality, better-matching, and semantically richer captions based on the retrieval-augmented prompts.
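
The idea of placing model-generated captions and retrieved captions into one template can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the template wording, the function names (`jaccard`, `retrieve`, `build_prompt`), and the token-overlap retriever are all hypothetical stand-ins, since the abstract does not specify the retriever or the template.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two captions.

    Assumption: a simple stand-in for whatever retriever the paper uses.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Return the k corpus captions most similar to the query caption."""
    ranked = sorted(corpus, key=lambda c: jaccard(query, c), reverse=True)
    return ranked[:k]


# Hypothetical template; the actual wording is not given in the abstract.
TEMPLATE = (
    "Candidate captions from captioning models:\n{candidates}\n\n"
    "Retrieved similar captions:\n{retrieved}\n\n"
    "Write one caption that is accurate, matches the image, "
    "and is semantically rich."
)


def build_prompt(candidates: List[str], corpus: List[str], k: int = 2) -> str:
    """Fill the template with candidate captions and retrieved captions."""
    # Retrieval is keyed on the first candidate caption (an assumption).
    retrieved = retrieve(candidates[0], corpus, k)

    def as_bullets(items: List[str]) -> str:
        return "\n".join(f"- {c}" for c in items)

    return TEMPLATE.format(
        candidates=as_bullets(candidates),
        retrieved=as_bullets(retrieved),
    )


if __name__ == "__main__":
    corpus = [
        "a dog running on a sandy beach",
        "a cat sleeps on a sofa",
        "two dogs play on the beach",
    ]
    print(build_prompt(["a dog runs on the beach"], corpus))
```

In a real pipeline the resulting prompt would be passed to the captioning model; an embedding-based retriever would likely replace the token-overlap score used here.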