Exploiting Multiple Sequence Lengths in Fast End to End Training for Image Captioning

13 Aug 2022 · Jia Cheng Hu, Roberto Cavicchioli, Alessandro Capotondi

We introduce a method called the Expansion mechanism, which processes the input at sequence lengths unconstrained by the number of input elements. By doing so, the model can learn more effectively than traditional attention-based approaches. To support this claim, we design a novel architecture, ExpansionNet v2, which achieved strong results on the MS COCO 2014 Image Captioning challenge and the state of the art in its category: 143.7 CIDEr-D on the offline test split, 140.8 CIDEr-D on the online evaluation server, and 72.9 overall CIDEr on the nocaps validation set. Additionally, we introduce an end-to-end training algorithm that is up to 2.8 times faster than established alternatives. Source code is available at: https://github.com/jchenghu/ExpansionNet_v2
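The paper itself defines the Expansion mechanism precisely; purely as illustration of the general idea (processing a sequence at a length decoupled from its input length), here is a minimal sketch assuming a learned-slot formulation. The names ExpansionBlock and num_slots, and the attention-style redistribution, are our assumptions, not taken from the paper or the repository:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExpansionBlock(nn.Module):
    """Minimal sketch, NOT the authors' implementation (see the linked repo).
    Idea illustrated: redistribute a sequence of arbitrary length N onto a
    fixed number of learned slots, process it there, and map the result back,
    so the intermediate computation is independent of N."""

    def __init__(self, d_model: int, num_slots: int):
        super().__init__()
        # Learned queries that define the expanded sequence length.
        self.slots = nn.Parameter(torch.randn(num_slots, d_model))
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, N, d_model), where N may differ across calls.
        scale = x.size(-1) ** 0.5
        # Forward expansion: each slot attends over the N input tokens.
        attn = torch.softmax(self.slots @ x.transpose(1, 2) / scale, dim=-1)
        expanded = F.relu(self.proj(attn @ x))  # (batch, num_slots, d_model)
        # Backward compression: each input position attends over the slots.
        back = torch.softmax(x @ self.slots.t() / scale, dim=-1)
        return x + back @ expanded              # residual, (batch, N, d_model)


# Usage: the same module handles any input length.
block = ExpansionBlock(d_model=512, num_slots=64)
for n in (10, 37):
    print(block(torch.randn(2, n, 512)).shape)  # (2, n, 512)
```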


Results from the Paper


Task               Dataset         Model                                 Metric    Value   Global Rank
Image Captioning   COCO Captions   ExpansionNet v2 (No VL pretraining)   BLEU-4    42.7    #6
                                                                         METEOR    30.6    #11
                                                                         ROUGE-L   61.1    #1
                                                                         CIDEr     143.7   #11
                                                                         SPICE     24.7    #10
                                                                         BLEU-1    83.5    #2
Image Captioning   MS COCO         ExpansionNet v2                       CIDEr     143.7   #1
