Visual Object Tracking
150 papers with code • 21 benchmarks • 26 datasets
Visual Object Tracking is an important research topic in computer vision, image understanding and pattern recognition. Given the initial state (centre location and scale) of a target in the first frame of a video sequence, the aim of Visual Object Tracking is to automatically estimate the state of that target in each subsequent frame.
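The problem formulation above can be sketched with a toy tracker. The code below is a minimal, hypothetical illustration (not any of the listed methods): given the target's initial bounding box in the first frame, it searches a small neighbourhood in each subsequent frame for the patch that best matches the initial template under a sum-of-squared-differences cost. All function names and the synthetic moving-blob sequence are invented for this sketch.

```python
def crop(frame, x, y, w, h):
    # Extract the w x h patch whose top-left corner is (x, y).
    return [row[x:x + w] for row in frame[y:y + h]]

def ssd(a, b):
    # Sum of squared differences between two equal-size patches.
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def track(frames, init_box, radius=2):
    # Return the estimated (x, y, w, h) state for every frame,
    # starting from the given initial state in frames[0].
    x, y, w, h = init_box
    template = crop(frames[0], x, y, w, h)
    states = [init_box]
    for frame in frames[1:]:
        best = None
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                nx, ny = x + dx, y + dy
                if nx < 0 or ny < 0 or ny + h > len(frame) or nx + w > len(frame[0]):
                    continue  # candidate box falls outside the frame
                cost = ssd(template, crop(frame, nx, ny, w, h))
                if best is None or cost < best[0]:
                    best = (cost, nx, ny)
        _, x, y = best
        states.append((x, y, w, h))
    return states

# Synthetic sequence: a 2x2 bright blob drifts one pixel right per frame.
def make_frame(bx, by, size=8):
    frame = [[0] * size for _ in range(size)]
    for dy in range(2):
        for dx in range(2):
            frame[by + dy][bx + dx] = 255
    return frame

frames = [make_frame(1 + t, 3) for t in range(4)]
print(track(frames, (1, 3, 2, 2)))
# → [(1, 3, 2, 2), (2, 3, 2, 2), (3, 3, 2, 2), (4, 3, 2, 2)]
```

Real trackers replace the exhaustive template search with learned appearance models (e.g. Siamese networks or transformers, as in the papers below), but the input/output contract is the same: an initial state in, a state per frame out.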
Libraries
Use these libraries to find Visual Object Tracking models and implementations.
Latest papers
Event Stream-based Visual Object Tracking: A High-Resolution Benchmark Dataset and A Novel Baseline
Tracking using bio-inspired event cameras has drawn increasing attention in recent years.
Mobile Vision Transformer-based Visual Object Tracking
We propose a lightweight, accurate, and fast tracking algorithm using Mobile Vision Transformers (MobileViT) as the backbone for the first time.
Separable Self and Mixed Attention Transformers for Efficient Object Tracking
Our ablation study testifies to the effectiveness of the proposed combination of backbone and head modules.
Improving Underwater Visual Tracking With a Large Scale Dataset and Image Enhancement
The method has resulted in a significant performance improvement of up to 5.0% AUC for state-of-the-art (SOTA) visual trackers.
Integrating Boxes and Masks: A Multi-Object Framework for Unified Visual Tracking and Segmentation
Tracking any given object(s) spatially and temporally is a common purpose in Visual Object Tracking (VOT) and Video Object Segmentation (VOS).
360VOT: A New Benchmark Dataset for Omnidirectional Visual Object Tracking
360° images can provide an omnidirectional field of view which is important for stable and long-term scene perception.
Tracking Anything in High Quality
To further improve the quality of tracking masks, a pretrained MR model is employed to refine the tracking results.
Cross-Drone Transformer Network for Robust Single Object Tracking
During tracking, a cross-drone mapping mechanism uses the surrounding information from a drone with a promising tracking status as a reference, helping drones that have lost their targets to re-calibrate and enabling real-time cross-drone information interaction.
Unified Sequence-to-Sequence Learning for Single- and Multi-Modal Visual Object Tracking
In this paper, we introduce a new sequence-to-sequence learning framework for RGB-based and multi-modal object tracking.
DropMAE: Masked Autoencoders with Spatial-Attention Dropout for Tracking Tasks
However, we find that this simple baseline heavily relies on spatial cues while ignoring temporal relations for frame reconstruction, thus leading to sub-optimal temporal matching representations for VOT and VOS.