LXMERT Model Compression for Visual Question Answering

23 Oct 2023 · Maryam Hashemi, Ghazaleh Mahmoudi, Sara Kodeiri, Hadi Sheikhi, Sauleh Eetemadi

Large-scale pretrained models such as LXMERT are becoming popular for learning cross-modal representations on text-image pairs for vision-language tasks. According to the lottery ticket hypothesis, NLP and computer vision models contain smaller subnetworks that can be trained in isolation to full performance. In this paper, we combine these observations to evaluate whether such trainable subnetworks exist in LXMERT when it is fine-tuned on the VQA task. In addition, we perform a model-size cost-benefit analysis by investigating how much pruning can be done without a significant loss in accuracy. Our experimental results demonstrate that LXMERT can be effectively pruned by 40%-60% in size with a 3% loss in accuracy.
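
As a rough illustration of the low-magnitude pruning named in the results, and of the weight reset used in lottery-ticket experiments, the sketch below zeroes the smallest-magnitude fraction of a toy model's weights and then restores the surviving weights to their initial values. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `magnitude_prune` and `rewind`, the layer names, the flat-list weight layout, and the choice of a single global ranking across layers are all hypothetical, since the abstract does not specify these details.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest |value|.

    `weights` maps a (hypothetical) layer name to a flat list of floats.
    Returns (pruned_weights, masks); a mask entry is 1 where the weight survives.
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    # Rank every weight by magnitude across all layers (assumed global ranking).
    ranked = sorted(
        (abs(w), name, i)
        for name, values in weights.items()
        for i, w in enumerate(values)
    )
    n_prune = int(len(ranked) * sparsity)
    to_prune = {(name, i) for _, name, i in ranked[:n_prune]}
    masks = {
        name: [0 if (name, i) in to_prune else 1 for i in range(len(values))]
        for name, values in weights.items()
    }
    pruned = {
        name: [w * m for w, m in zip(values, masks[name])]
        for name, values in weights.items()
    }
    return pruned, masks


def rewind(initial_weights, masks):
    """Lottery-ticket-style reset: keep the mask, restore initial values."""
    return {
        name: [w * m for w, m in zip(values, masks[name])]
        for name, values in initial_weights.items()
    }


# Toy example: prune half of eight weights, then rewind to the initial values.
initial = {"attn": [0.1, 0.2, -0.3, 0.4], "ffn": [0.5, -0.6, 0.7, 0.8]}
trained = {"attn": [0.9, -0.05, 0.4, -0.7], "ffn": [0.02, -0.3, 0.6, 0.1]}
pruned, masks = magnitude_prune(trained, sparsity=0.5)
ticket = rewind(initial, masks)
```

In a lottery-ticket experiment the masked subnetwork (`ticket` here) would then be retrained and its accuracy compared with the unpruned model at each sparsity level.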

Code

ghazaleh-mahmoodi/lxmert_compression official

Tasks

Model Compression

Visual Question Answering

Visual Question Answering (VQA)

Datasets

Visual Question Answering

Visual Question Answering v2.0

Results from the Paper

Ranked #2 on Visual Question Answering on VQA v2 test-std

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Visual Question Answering	VQA v2 test-dev	LXMERT (low-magnitude pruning)	Accuracy	70.72	#8
Visual Question Answering	VQA v2 test-std	LXMERT (low-magnitude pruning)	Accuracy	70.87	#2

Methods

LXMERT • Pruning
