RGB-T Tracking via Multi-Modal Mutual Prompt Learning

31 Aug 2023  ·  Yang Luo, Xiqing Guo, Hui Feng, Lei Ao ·

Object tracking based on the fusion of visible and thermal images, known as RGB-T tracking, has gained increasing attention from researchers in recent years. Achieving a more comprehensive fusion of information from the two modalities at a lower computational cost remains an open problem. Recently, with the rise of prompt learning in computer vision, knowledge from large vision models can be transferred more effectively to downstream tasks. Considering the strong complementarity between the visible and thermal modalities, we propose a tracking architecture based on mutual prompt learning between the two modalities. We also design a lightweight prompter that incorporates attention mechanisms along two dimensions to transfer information from one modality to the other at low computational cost, and embed it into each layer of the backbone. Extensive experiments demonstrate that our proposed tracking architecture is effective and efficient, achieving state-of-the-art performance while maintaining high running speeds.
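The abstract describes a lightweight prompter that uses attention along two dimensions (channel and spatial) to pass information from one modality to the other at each backbone layer. The paper itself gives no code here, so the following NumPy sketch is only a loose illustration of that idea under my own assumptions; the function names (`prompt`, `mutual_prompt`) and the exact attention formulation are hypothetical, not the authors' MPLT implementation.

```python
import numpy as np


def _sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def prompt(src, dst):
    """Transfer information from modality `src` into modality `dst`.

    Both inputs are feature maps of shape (C, H, W). Channel attention is
    derived by global average pooling over the spatial dimensions; spatial
    attention by averaging over channels. The gated source features are
    added to the destination as a prompt (an illustrative choice, not the
    paper's exact design).
    """
    ch_att = _sigmoid(src.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)
    sp_att = _sigmoid(src.mean(axis=0, keepdims=True))       # (1, H, W)
    return dst + ch_att * sp_att * src


def mutual_prompt(rgb_feat, tir_feat):
    """Mutual prompting: each modality prompts the other."""
    return prompt(tir_feat, rgb_feat), prompt(rgb_feat, tir_feat)
```

In an actual tracker this exchange would sit inside each backbone layer, operating on learned features rather than raw arrays; the sketch only shows why the prompter stays cheap: two pooled attention maps and an elementwise gate, with no extra convolutional branches.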


Results from the Paper


Task: RGB-T Tracking    Model: MPLT

Dataset    Metric       Value    Global Rank
LasHeR     Precision    72.0     #3
LasHeR     Success      57.1     #3
RGBT210    Precision    86.2     #1
RGBT210    Success      63.0     #1
RGBT234    Precision    88.4     #4
RGBT234    Success      65.7     #5

Methods


No methods listed for this paper.