Video Object Tracking
28 papers with code • 3 benchmarks • 11 datasets
Video Object Detection aims to detect targets in videos using both spatial and temporal information. It's usually deeply integrated with tasks such as Object Detection and Object Tracking.
Libraries
Use these libraries to find Video Object Tracking models and implementations
Most implemented papers
Revealing the Dark Secrets of Masked Image Modeling
In this paper, we compare MIM with the long-dominant supervised pre-trained models from two perspectives, the visualizations and the experiments, to uncover their key representational differences.
Learning What and Where: Disentangling Location and Identity Tracking Without Supervision
Moreover, it can anticipate object motion and interactions, which are crucial abilities for conceptual planning and reasoning.
A Real-Time Wrong-Way Vehicle Detection Based on YOLO and Centroid Tracking
By detecting wrong-way vehicles, the number of accidents can be minimized and traffic jams can be reduced.
Target-Aware Tracking with Long-term Context Attention
Most deep trackers still follow the guidance of the siamese paradigms and use a template that contains only the target without any contextual information, which makes it difficult for the tracker to cope with large appearance changes, rapid target movement, and attraction from similar objects.
Track Anything: Segment Anything Meets Videos
Therefore, in this report, we propose Track Anything Model (TAM), which achieves high-performance interactive tracking and segmentation in videos.
Single-Model and Any-Modality for Video Object Tracking
In practice, most existing RGB trackers learn a single set of parameters to use them across datasets and applications.
ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe
We present ARTrackV2, which integrates two pivotal aspects of tracking: determining where to look (localization) and how to describe (appearance analysis) the target object across video frames.
UniVS: Unified and Universal Video Segmentation with Prompts as Queries
Despite the recent advances in unified image segmentation (IS), developing a unified video segmentation (VS) model remains a challenge.