Visual Object Tracking

150 papers with code • 21 benchmarks • 26 datasets

Visual Object Tracking is an important research topic in computer vision, image understanding and pattern recognition. Given the initial state (centre location and scale) of a target in the first frame of a video sequence, the aim of Visual Object Tracking is to automatically obtain the states of the object in the subsequent video frames.

Source: Learning Adaptive Discriminative Correlation Filters via Temporal Consistency Preserving Spatial Feature Selection for Robust Visual Object Tracking
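
To make the task definition concrete, here is a minimal sketch of a tracking loop. It is a naive template-matching baseline, not the method of the source paper or of any paper listed on this page. It assumes grayscale frames stored as nested lists, represents the state as `(cx, cy, w, h)` (centre plus a fixed width and height standing in for scale), and relocates the target each frame by minimising the sum of squared differences (SSD) in a small search window. The names `ssd`, `track`, `make_demo_frames` and `search_radius` are hypothetical.

```python
def ssd(frame, template, x0, y0):
    """Sum of squared differences between the template and the frame patch
    whose top-left corner is (x0, y0)."""
    return sum(
        (frame[y0 + r][x0 + c] - template[r][c]) ** 2
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def track(frames, init_state, search_radius=4):
    """Return one (cx, cy, w, h) state per frame.

    init_state is the target state in frames[0]. The template is cut once
    from the first frame and never updated; w and h stay fixed (assumption).
    """
    cx, cy, w, h = init_state
    x0, y0 = cx - w // 2, cy - h // 2
    template = [row[x0:x0 + w] for row in frames[0][y0:y0 + h]]
    states = [init_state]
    for frame in frames[1:]:
        height, width = len(frame), len(frame[0])
        best_cost, best = None, (cx, cy)
        # Exhaustive search around the previous centre.
        for dy in range(-search_radius, search_radius + 1):
            for dx in range(-search_radius, search_radius + 1):
                x0, y0 = cx + dx - w // 2, cy + dy - h // 2
                # Skip candidate windows that leave the frame.
                if x0 < 0 or y0 < 0 or x0 + w > width or y0 + h > height:
                    continue
                cost = ssd(frame, template, x0, y0)
                if best_cost is None or cost < best_cost:
                    best_cost, best = cost, (cx + dx, cy + dy)
        cx, cy = best
        states.append((cx, cy, w, h))
    return states


def make_demo_frames(n=5, size=40, box=8):
    """Synthetic video: a bright box moving 2 px right and 1 px down per frame."""
    frames = []
    for t in range(n):
        frame = [[0] * size for _ in range(size)]
        for r in range(10 + t, 10 + t + box):
            for c in range(10 + 2 * t, 10 + 2 * t + box):
                frame[r][c] = 1
        frames.append(frame)
    return frames
```

Calling `track(make_demo_frames(), (14, 14, 8, 8))` follows the synthetic box as it moves. Real trackers replace the fixed template and exhaustive search with learned features, scale estimation and model updates.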

LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks

faceonlive/ai-research 9 Apr 2024

To achieve high accuracy on both clean and adversarial data, we propose building a spatial-temporal continuous representation using the semantic text guidance of the object of interest.

131 stars

OmniVid: A Generative Framework for Universal Video Understanding

wangjk666/omnivid 26 Mar 2024

The core of video understanding tasks, such as recognition, captioning, and tracking, is to automatically detect objects or actions in a video and analyze their temporal evolution.

16 stars

Elysium: Exploring Object-level Perception in Videos via MLLM

hon-wong/elysium 25 Mar 2024

Multi-modal Large Language Models (MLLMs) have demonstrated their ability to perceive objects in still images, but their application in video-related tasks, such as object tracking, remains understudied.

19 stars

SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking

hoqolo/sdstrack 24 Mar 2024

Multimodal Visual Object Tracking (VOT) has recently gained significant attention due to its robustness.

14 stars

VastTrack: Vast Category Visual Object Tracking

henglan/vasttrack 6 Mar 2024

The rich annotations of VastTrack enable the development of both vision-only and vision-language tracking.

33 stars

Spatio-temporal Prompting Network for Robust Video Feature Extraction

guanxiongsun/vfe.pytorch ICCV 2023

Then, these video prompts are prepended to the patch embeddings of the current frame as the updated input for video feature extraction.

16 stars
04 Feb 2024

Correlation-Embedded Transformer Tracking: A Single-Branch Framework

phiphiphi31/SBT 23 Jan 2024

Thus, we reformulate the two-branch Siamese tracking as a conceptually simple, fully transformer-based Single-Branch Tracking pipeline, dubbed SBT.

13 stars

Explicit Visual Prompts for Visual Object Tracking

GXNU-ZhongLab/EVPTrack 6 Jan 2024

Specifically, we utilize spatio-temporal tokens to propagate information between consecutive frames without focusing on updating templates.

6 stars

ODTrack: Online Dense Temporal Token Learning for Visual Tracking

gxnu-zhonglab/odtrack 3 Jan 2024

To alleviate the above problem, we propose a simple, flexible and effective video-level tracking pipeline, named ODTrack, which densely associates the contextual relationships of video frames in an online token propagation manner.

72 stars

ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe

miv-xjtu/artrack 28 Dec 2023

We present ARTrackV2, which integrates two pivotal aspects of tracking: determining where to look (localization) and how to describe (appearance analysis) the target object across video frames.

185 stars