The ability to evaluate TIR pedestrian trackers fairly on a benchmark dataset is important for the development of this field.
We evaluate and analyze more than 30 trackers on LSOTB-TIR to provide a series of baselines, and the results show that deep trackers achieve promising performance.
These two feature models are learned using a multi-task matching framework and are jointly optimized on the TIR tracking task.
In this paper, we cast the TIR tracking problem as a similarity verification task, which is well coupled to the objective of the tracking task.
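As a rough illustration of what treating tracking as similarity verification can look like, the sketch below scores candidate feature vectors against a template and selects the best match. It is a minimal sketch under stated assumptions, not the method of the paper: cosine similarity stands in for the learned similarity function, and the names `cosine_similarity`, `verify_candidates`, `template`, and `candidates` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


def verify_candidates(template, candidates):
    """Score every candidate against the template and return (best index, scores).

    Assumption: cosine similarity replaces the learned similarity network.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    scores = [cosine_similarity(template, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy features (hypothetical): candidate 1 is a scaled copy of the template.
template = [1.0, 0.0, 2.0]
candidates = [[0.0, 1.0, 0.0], [2.0, 0.0, 4.0], [1.0, 1.0, 1.0]]
best_index, scores = verify_candidates(template, candidates)
```

In a tracker, the candidates would be features of image patches sampled around the previous target position, and the highest-scoring patch would give the new target location.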
We evaluated the performance of the system by training it to recognise 32 material types in both indoor and outdoor environments.
We observe that features from the fully-connected layer are not suitable for thermal infrared tracking because they lack spatial information about the target, whereas features from the convolutional layers are suitable.
These two similarities complement each other and hence enhance the discriminative capacity of the network for handling distractors.
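One simple way to see how two complementary similarities can help against distractors is to fuse their scores. The sketch below is only an assumption for illustration: it uses a fixed weighted sum, whereas the network described here learns how the two similarities interact, and the names `fuse_similarities`, `scores_a`, `scores_b`, and `weight_a` are hypothetical.

```python
def fuse_similarities(scores_a, scores_b, weight_a=0.5):
    """Combine two per-candidate similarity scores with a weighted sum.

    Assumption: a fixed linear fusion stands in for the learned combination.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both score lists must cover the same candidates")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(scores_a, scores_b)]


# Toy scores (hypothetical): index 0 is the target, index 1 a distractor.
scores_a = [0.9, 0.9, 0.2]  # the first similarity cannot separate target and distractor
scores_b = [0.8, 0.3, 0.7]  # the second similarity separates them
fused = fuse_similarities(scores_a, scores_b)
best_index = max(range(len(fused)), key=fused.__getitem__)
```

Here the first similarity alone ties the target with the distractor, while the fused score ranks the target first, which is the complementary effect the text describes.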