no code implementations • 15 Aug 2022 • Guoping Zhao, Bingqing Zhang, Mingyu Zhang, Yaxian Li, Jiajun Liu, Ji-Rong Wen
It models a video with a lattice feature graph in which the nodes represent regions of different granularity, with weighted edges that represent the spatial and temporal links.
1 code implementation • 15 Aug 2022 • Yaxian Li, Bingqing Zhang, Guoping Zhao, Mingyu Zhang, Jiajun Liu, Ziwei Wang, Ji-Rong Wen
After surveying the privacy concerns raised by person-tracking systems, we propose InvisibiliTee, a black-box adversarial attack method against state-of-the-art human detection models.
no code implementations • 18 Apr 2022 • Xun Wang, Bingqing Ke, Xuanping Li, Fangyu Liu, Mingyu Zhang, Xiao Liang, Qiushi Xiao, Cheng Luo, Yue Yu
This modality imbalance results from a) modality gap: the relevance between a query and a video text is much easier to learn as the query is also a piece of text, with the same modality as the video text; b) data bias: most training samples can be solved solely by text matching.
no code implementations • 12 Jul 2019 • Guoping Zhao, Mingyu Zhang, Jiajun Liu, Ji-Rong Wen
Such a tendency indicates that the model indeed learned how to toy with both image retrieval systems and human eyes.