Object Tracking

588 papers with code • 7 benchmarks • 62 datasets

Object tracking is the task of taking an initial set of object detections, assigning a unique ID to each one, and then following each object as it moves across the frames of a video while keeping its ID assignment consistent. State-of-the-art methods fuse data from RGB and event-based cameras to produce more reliable tracking, although CNN-based models that use only RGB images as input are also effective. The most popular benchmark is OTB. Several evaluation metrics are specific to object tracking, including HOTA, MOTA, IDF1, and Track-mAP.

(Image credit: Towards-Realtime-MOT)
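To make the ID-assignment idea in the task description concrete, the following is a minimal sketch of a greedy IoU-based tracker. It is not the method of any paper or library listed on this page. The function names `iou` and `track`, the `(x1, y1, x2, y2)` box format, and the `iou_threshold` default of 0.3 are all assumptions made for illustration. Each detection in the first frame receives a fresh ID, and detections in later frames inherit the ID of the existing track they overlap most.

```python
from itertools import count


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def track(frames, iou_threshold=0.3):
    """Return, for every frame, the track ID assigned to each detection.

    `frames` is a list of frames; each frame is a list of boxes.
    Hypothetical simplification: tracks are never deleted, and each
    track remembers only the last box it was matched to.
    """
    next_id = count(1)
    tracks = {}   # track ID -> last matched box
    results = []
    for boxes in frames:
        # Score every (track, detection) pair, best overlap first.
        pairs = sorted(
            ((iou(last_box, box), tid, j)
             for tid, last_box in tracks.items()
             for j, box in enumerate(boxes)),
            reverse=True,
        )
        assigned = {}      # detection index -> track ID
        used_tracks = set()
        for score, tid, j in pairs:
            if score < iou_threshold:
                break      # remaining pairs overlap too little
            if tid in used_tracks or j in assigned:
                continue   # each track and detection is matched once
            assigned[j] = tid
            used_tracks.add(tid)
        ids = []
        for j, box in enumerate(boxes):
            tid = assigned.get(j)
            if tid is None:
                tid = next(next_id)   # unmatched detection starts a new track
            tracks[tid] = box
            ids.append(tid)
        results.append(ids)
    return results


if __name__ == "__main__":
    frames = [
        [(0, 0, 10, 10), (20, 20, 30, 30)],
        [(21, 21, 31, 31), (1, 1, 11, 11)],   # same objects, listed in swapped order
    ]
    print(track(frames))
```

Greedy matching is used here only to keep the sketch short. Practical trackers typically add a motion model, appearance features, an optimal assignment step, and rules for ending tracks that have gone unmatched.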

Benchmarks

These leaderboards are used to track progress in object tracking.

Dataset	Best Model
COESOT	HR-CEUTrack-Large
FE108	HR-MonTrack-Base
SeaDronesSee	DiMP50
KITTI	M2-Track
MMPTRACK	UMMT
VisEvent	RT-MDNet
BIRDSAI - ICVGIP 2020	final

Libraries

Use these libraries to find object tracking models and implementations.

visionml/pytracking • 9 papers • 3,094 stars
PaddlePaddle/PaddleDetection • 8 papers • 12,095 stars
open-mmlab/mmtracking • 6 papers • 3,384 stars
mikel-brostrom/yolo_tracking • 5 papers • 6,130 stars

Datasets

Subtasks

Cell Tracking

Video Object Tracking

Online Multi-Object Tracking

Thermal Infrared Object Tracking

Sports Ball Detection and Tracking

Pupil Tracking

Amodal Tracking

Latest papers

SceneTracker: Long-term Scene Flow Estimation Network

wwsource/scenetracker • 29 Mar 2024

Considering the complementarity of scene flow estimation in the spatial domain's focusing capability and 3D object tracking in the temporal domain's coherence, this study aims to address a comprehensive new task that can simultaneously capture fine-grained and long-term 3D motion in an online manner: long-term scene flow estimation (LSFE).

OmniVid: A Generative Framework for Universal Video Understanding

wangjk666/omnivid • 26 Mar 2024

The core of video understanding tasks, such as recognition, captioning, and tracking, is to automatically detect objects or actions in a video and analyze their temporal evolution.

Multiple Object Tracking as ID Prediction

MCG-NJU/MOTIP • 25 Mar 2024

In Multiple Object Tracking (MOT), tracking-by-detection methods have stood the test for a long time, which split the process into two parts according to the definition: object detection and association.

Elysium: Exploring Object-level Perception in Videos via MLLM

hon-wong/elysium • 25 Mar 2024

Multi-modal Large Language Models (MLLMs) have demonstrated their ability to perceive objects in still images, but their application in video-related tasks, such as object tracking, remains understudied.

SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking

hoqolo/sdstrack • 24 Mar 2024

Multimodal Visual Object Tracking (VOT) has recently gained significant attention due to its robustness.

PNAS-MOT: Multi-Modal Object Tracking with Pareto Neural Architecture Search

pholypeng/pnas-mot • 23 Mar 2024

Multiple object tracking is a critical task in autonomous driving.

Fast-Poly: A Fast Polyhedral Framework For 3D Multi-Object Tracking

lixiaoyu2000/fastpoly • 20 Mar 2024

3D Multi-Object Tracking (MOT) captures stable and comprehensive motion states of surrounding obstacles, essential for robotic perception.

Lifting Multi-View Detection and Tracking to the Bird's Eye View

tteepe/tracktacular • 19 Mar 2024

Taking advantage of multi-view aggregation presents a promising solution to tackle challenges such as occlusion and missed detection in multi-object tracking and detection.

NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices

neufieldrobotics/neuflow • 15 Mar 2024

Given the features of the input images extracted at different spatial resolutions, global matching is employed to estimate an initial optical flow on the 1/16 resolution, capturing large displacement, which is then refined on the 1/8 resolution with lightweight CNN layers for better accuracy.

Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline

event-ahu/eventvot_benchmark • 9 Mar 2024

Current event-/frame-event based trackers undergo evaluation on short-term tracking datasets, however, the tracking of real-world scenarios involves long-term tracking, and the performance of existing tracking algorithms in these scenarios remains unclear.
