Object Tracking
588 papers with code • 7 benchmarks • 62 datasets
Object tracking is the task of taking an initial set of object detections, assigning a unique ID to each one, and then following each object as it moves across the frames of a video while maintaining the ID assignment. State-of-the-art methods fuse data from RGB and event-based cameras to produce more reliable tracking. CNN-based models that take only RGB images as input are also effective. The most popular benchmark is OTB. Evaluation metrics specific to object tracking include HOTA, MOTA, IDF1, and Track-mAP.
(Image credit: Towards-Realtime-MOT)
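The ID-maintenance step in the task description can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as (x1, y1, x2, y2) and uses greedy IoU matching between consecutive frames. The class name GreedyIoUTracker, the iou helper, and the 0.3 threshold are hypothetical choices made for illustration, not taken from any of the listed papers or libraries.

```python
from itertools import count
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class GreedyIoUTracker:
    """Hypothetical minimal tracker: keeps one box per track ID."""

    def __init__(self, iou_threshold: float = 0.3) -> None:
        self.iou_threshold = iou_threshold  # illustrative value
        self.tracks: Dict[int, Box] = {}
        self._next_id = count(1)

    def update(self, detections: List[Box]) -> List[int]:
        """Return one track ID per detection, in input order."""
        # Score every (track, detection) pair, best overlap first.
        pairs = sorted(
            ((iou(box, det), tid, j)
             for tid, box in self.tracks.items()
             for j, det in enumerate(detections)),
            reverse=True,
        )
        assigned: Dict[int, int] = {}  # detection index -> track ID
        used_tracks = set()
        for score, tid, j in pairs:
            if score < self.iou_threshold:
                break  # remaining pairs overlap too little to match
            if tid in used_tracks or j in assigned:
                continue
            assigned[j] = tid
            used_tracks.add(tid)

        ids: List[int] = []
        new_tracks: Dict[int, Box] = {}
        for j, det in enumerate(detections):
            # Unmatched detections start a new track with a fresh ID.
            tid = assigned[j] if j in assigned else next(self._next_id)
            new_tracks[tid] = det
            ids.append(tid)
        # Simplification: tracks with no match in this frame are dropped.
        self.tracks = new_tracks
        return ids


if __name__ == "__main__":
    tracker = GreedyIoUTracker()
    print(tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)]))      # [1, 2]
    print(tracker.update([(52, 50, 62, 60), (1, 0, 11, 10)]))      # [2, 1]
    print(tracker.update([(2, 0, 12, 10), (200, 200, 210, 210)]))  # [1, 3]
```

In the second frame the detections arrive in a different order, yet each keeps the ID it was given in the first frame. The sketch deliberately omits motion models, appearance features, and re-identification after occlusion. Practical trackers often replace the greedy loop with Hungarian assignment.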
Latest papers
SceneTracker: Long-term Scene Flow Estimation Network
Considering the complementarity of scene flow estimation in the spatial domain's focusing capability and 3D object tracking in the temporal domain's coherence, this study aims to address a comprehensive new task that can simultaneously capture fine-grained and long-term 3D motion in an online manner: long-term scene flow estimation (LSFE).
OmniVid: A Generative Framework for Universal Video Understanding
The core of video understanding tasks, such as recognition, captioning, and tracking, is to automatically detect objects or actions in a video and analyze their temporal evolution.
Multiple Object Tracking as ID Prediction
In Multiple Object Tracking (MOT), tracking-by-detection methods have stood the test of time; they split the process into two parts according to the definition: object detection and association.
Elysium: Exploring Object-level Perception in Videos via MLLM
Multi-modal Large Language Models (MLLMs) have demonstrated their ability to perceive objects in still images, but their application in video-related tasks, such as object tracking, remains understudied.
SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking
Multimodal Visual Object Tracking (VOT) has recently gained significant attention due to its robustness.
PNAS-MOT: Multi-Modal Object Tracking with Pareto Neural Architecture Search
Multiple object tracking is a critical task in autonomous driving.
Fast-Poly: A Fast Polyhedral Framework For 3D Multi-Object Tracking
3D Multi-Object Tracking (MOT) captures stable and comprehensive motion states of surrounding obstacles, essential for robotic perception.
Lifting Multi-View Detection and Tracking to the Bird's Eye View
Taking advantage of multi-view aggregation presents a promising solution to tackle challenges such as occlusion and missed detection in multi-object tracking and detection.
NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices
Given the features of the input images extracted at different spatial resolutions, global matching is employed to estimate an initial optical flow on the 1/16 resolution, capturing large displacement, which is then refined on the 1/8 resolution with lightweight CNN layers for better accuracy.
Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline
Current event-based and frame-event based trackers are evaluated on short-term tracking datasets; however, real-world scenarios involve long-term tracking, and the performance of existing tracking algorithms in these scenarios remains unclear.