Multiple Object Tracking from appearance by hierarchically clustering tracklets

7 Oct 2022 · Andreu Girbau, Ferran Marqués, Shin'ichi Satoh

Current approaches in Multiple Object Tracking (MOT) rely on the spatio-temporal coherence between detections, combined with object appearance, to match objects across consecutive frames. In this work, we explore MOT using object appearance as the main source of association between objects in a video, with spatial and temporal priors acting as weighting factors. We form initial tracklets by leveraging the idea that instances of an object that are close in time should be similar in appearance, and we build the final object tracks by fusing the tracklets in a hierarchical fashion. We conduct extensive experiments that show the effectiveness of our method on three MOT benchmarks: MOT17, MOT20, and DanceTrack. The method is competitive on MOT17 and MOT20 and establishes state-of-the-art results on DanceTrack.
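
The tracklet-fusion idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes each tracklet is summarized by one appearance vector, uses cosine distance, replaces the paper's spatial and temporal weighting with a hard rule that tracklets sharing a frame cannot merge, and stops at a hypothetical threshold `max_distance`. The function name `cluster_tracklets` and the dictionary layout are illustrative choices only.

```python
import numpy as np


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two appearance vectors."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return 1.0 - float(a @ b)


def cluster_tracklets(tracklets, max_distance=0.3):
    """Greedy agglomerative fusion of tracklets by appearance.

    tracklets: list of dicts with keys "frames" (iterable of ints) and
    "feature" (1-D appearance vector). Assumed layout, for illustration.
    Returns a sorted list of tracks, each a sorted list of tracklet indices.
    """
    clusters = [
        {
            "frames": set(t["frames"]),
            "feature": np.asarray(t["feature"], dtype=float),
            "members": [i],
        }
        for i, t in enumerate(tracklets)
    ]
    while True:
        best = None  # (distance, i, j) of the closest mergeable pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # One object cannot appear twice in the same frame, so
                # clusters that overlap in time are never merged.
                if clusters[i]["frames"] & clusters[j]["frames"]:
                    continue
                d = cosine_distance(clusters[i]["feature"], clusters[j]["feature"])
                if d <= max_distance and (best is None or d < best[0]):
                    best = (d, i, j)
        if best is None:
            break  # no remaining pair is both compatible and similar enough
        _, i, j = best
        a, b = clusters[i], clusters[j]
        na, nb = len(a["frames"]), len(b["frames"])
        merged = {
            "frames": a["frames"] | b["frames"],
            # Length-weighted mean keeps long tracklets from being swamped.
            "feature": (na * a["feature"] + nb * b["feature"]) / (na + nb),
            "members": sorted(a["members"] + b["members"]),
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(c["members"] for c in clusters)


# Toy data: two objects, each split into an early and a late tracklet.
tracklets = [
    {"frames": range(0, 10), "feature": [1.0, 0.0]},
    {"frames": range(0, 10), "feature": [0.0, 1.0]},
    {"frames": range(10, 20), "feature": [0.9, 0.1]},
    {"frames": range(10, 20), "feature": [0.1, 0.9]},
]
tracks = cluster_tracklets(tracklets)
print(tracks)
```

On the toy data, tracklets 0 and 2 look alike and do not overlap in time, and the same holds for 1 and 3, so the sketch groups them into two tracks. The two resulting tracks share frames, which blocks any further merge.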


Results from the Paper


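Task                   Dataset     Model  Metric  Value  Global Rank
Multi-Object Tracking  DanceTrack  FCG    HOTA    48.7   #20
Multi-Object Tracking  DanceTrack  FCG    DetA    79.8   #14
Multi-Object Tracking  DanceTrack  FCG    AssA    29.9   #20
Multi-Object Tracking  DanceTrack  FCG    MOTA    89.9   #15
Multi-Object Tracking  DanceTrack  FCG    IDF1    46.5   #21
Multi-Object Tracking  MOT17       FCG    MOTA    76.7   #12
Multi-Object Tracking  MOT17       FCG    IDF1    77.7   #8
Multi-Object Tracking  MOT17       FCG    HOTA    62.6   #10
Multi-Object Tracking  MOT20       FCG    MOTA    68.0   #15
Multi-Object Tracking  MOT20       FCG    IDF1    69.7   #14
Multi-Object Tracking  MOT20       FCG    HOTA    57.3   #12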
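
As a reading aid for the DanceTrack rows: HOTA combines detection accuracy (DetA) and association accuracy (AssA). At a single localization threshold it is the geometric mean of the two, and the reported score is an average over thresholds, so the square root of the product of the reported DetA and AssA is only an approximation of the reported HOTA. A short Python check, using the listed values and illustrative variable names:

```python
import math

# DanceTrack values for FCG as listed in the results above.
det_a = 79.8
ass_a = 29.9
reported_hota = 48.7

# Geometric mean of the averaged scores; approximate because the true
# HOTA averages the per-threshold geometric means instead.
approx_hota = math.sqrt(det_a * ass_a)
print(f"approx HOTA = {approx_hota:.2f}, reported = {reported_hota}")
```

The approximation lands close to the reported value, which shows that the low AssA, not detection quality, is what holds the DanceTrack HOTA down.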
