Visual Prompt Multi-Modal Tracking

CVPR 2023  ·  Jiawen Zhu, Simiao Lai, Xin Chen, Dong Wang, Huchuan Lu

Visible-modal object tracking has given rise to a series of downstream multi-modal tracking branches. To inherit the powerful representations of the foundation model, a natural approach for multi-modal tracking is full fine-tuning of the RGB-based parameters. Although effective, this approach is suboptimal due to the scarcity of downstream data, limited transferability, and other factors. In this paper, inspired by the recent success of prompt learning in language models, we develop Visual Prompt multi-modal Tracking (ViPT), which learns modal-relevant prompts to adapt the frozen pre-trained foundation model to various downstream multi-modal tracking tasks. ViPT offers a better way to exploit the knowledge of the RGB-based model pre-trained at scale, while introducing only a few trainable parameters (less than 1% of the model parameters). ViPT outperforms the full fine-tuning paradigm on multiple downstream tracking tasks, including RGB+Depth, RGB+Thermal, and RGB+Event tracking. Extensive experiments show the potential of visual prompt learning for multi-modal tracking, and ViPT achieves state-of-the-art performance while remaining parameter-efficient. Code and models are available at https://github.com/jiawen-zhu/ViPT.
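The core idea in the abstract is to freeze the RGB-pre-trained backbone and train only a small set of modal-relevant prompt parameters that fuse the auxiliary modality (depth, thermal, or event) into the RGB stream. The sketch below illustrates one way such prompt injection can be wired up in PyTorch; it is not the authors' implementation, and the `PromptBlock` bottleneck design, the `blocks` attribute on the backbone, and all dimensions are illustrative assumptions.

```python
# Minimal sketch of prompt-based adaptation for multi-modal tracking,
# in the spirit of ViPT but not the official implementation.
import torch
import torch.nn as nn

class PromptBlock(nn.Module):
    """Tiny trainable module that fuses auxiliary-modality tokens
    (depth / thermal / event) into the frozen RGB token stream."""
    def __init__(self, dim, hidden=8):
        super().__init__()
        # A bottleneck keeps the prompt parameters well under 1% of the backbone.
        self.down = nn.Linear(dim, hidden)
        self.up = nn.Linear(hidden, dim)
        self.act = nn.GELU()

    def forward(self, rgb_tokens, aux_tokens):
        # Learn a modality-relevant prompt from the auxiliary input and
        # add it to the RGB tokens as a residual.
        prompt = self.up(self.act(self.down(aux_tokens)))
        return rgb_tokens + prompt

class PromptedTracker(nn.Module):
    """Frozen RGB backbone + per-layer trainable prompt blocks."""
    def __init__(self, rgb_backbone, dim=768, num_layers=12):
        super().__init__()
        self.backbone = rgb_backbone
        # Freeze all pre-trained RGB parameters.
        for p in self.backbone.parameters():
            p.requires_grad = False
        # Only these prompt blocks are trained on the downstream task.
        self.prompts = nn.ModuleList(PromptBlock(dim) for _ in range(num_layers))

    def forward(self, rgb_tokens, aux_tokens):
        # Assumes the backbone exposes its transformer layers as `blocks`
        # (as timm-style ViTs do); adapt to the actual backbone as needed.
        for block, prompt in zip(self.backbone.blocks, self.prompts):
            rgb_tokens = prompt(rgb_tokens, aux_tokens)
            rgb_tokens = block(rgb_tokens)
        return rgb_tokens
```

During training, only `model.prompts` would be passed to the optimizer, which is what keeps the trainable-parameter count to a small fraction of the full model.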


Results from the Paper


| Task           | Dataset | Model | Metric    | Value | Global Rank |
|----------------|---------|-------|-----------|-------|-------------|
| RGB-T Tracking | LasHeR  | ViPT  | Precision | 65.1  | #10         |
| RGB-T Tracking | LasHeR  | ViPT  | Success   | 52.5  | #10         |
| RGB-T Tracking | RGBT234 | ViPT  | Precision | 83.5  | #12         |
| RGB-T Tracking | RGBT234 | ViPT  | Success   | 61.7  | #11         |

Methods


No methods listed for this paper.