no code implementations • 17 May 2024 • Zikun Zhou, Wentao Xiong, Li Zhou, Xin Li, Zhenyu He, YaoWei Wang
As images and texts are mapped to uncoupled feature spaces, they face the arduous task of learning Vision-Language (VL) relation modeling from scratch.
Referring Expression Segmentation • Referring Video Object Segmentation
1 code implementation • 16 May 2024 • Zhilin Huang, Quanmin Liang, Yijie Yu, Chujun Qin, Xiawu Zheng, Kai Huang, Zikun Zhou, Wenming Yang
In this paper, we propose a bilateral event mining and complementary network (BMCNet) to fully leverage the potential of each event and capture the shared information to complement each other simultaneously.
no code implementations • 21 Apr 2024 • Zhilin Huang, Yijie Yu, Ling Yang, Chujun Qin, Bing Zheng, Xiawu Zheng, Zikun Zhou, YaoWei Wang, Wenming Yang
With the advancement of AIGC, video frame interpolation (VFI) has become a crucial component in existing video generation frameworks, attracting widespread research interest.
1 code implementation • 28 Mar 2024 • Yuqing Huang, Xin Li, Zikun Zhou, YaoWei Wang, Zhenyu He, Ming-Hsuan Yang
Based on the PN tree memory, we develop corresponding walking rules for determining the state of the target and define a set of control flows to unite the tracker and the detector in different tracking scenarios.
no code implementations • 17 Dec 2023 • Jingwen Zhang, Zikun Zhou, Guangming Lu, Jiandong Tian, Wenjie Pei
Considering that, we propose to construct a synthetic target representation composed of dense and complete point clouds depicting the target shape precisely by shape completion for robust 3D tracking.
no code implementations • 24 Aug 2023 • Zikun Zhou, Shukun Wu, Guoqing Zhu, Hongpeng Wang, Zhenyu He
In this paper, we propose a Channel and Spatial Relation-Propagation Network (CSRPNet) for RGB-T semantic segmentation, which propagates only modality-shared information across different modalities and alleviates the modality-specific information contamination issue.
Ranked #12 on Thermal Image Segmentation on PST900
no code implementations • 23 Aug 2023 • Chao Tian, Zikun Zhou, Yuqing Huang, Gaojun Li, Zhenyu He
RGB-Thermal (RGB-T) pedestrian detection aims to locate pedestrians in RGB-T image pairs, exploiting the complementarity between the two modalities to improve detection robustness in extreme conditions.
1 code implementation • 25 Mar 2023 • Zikun Zhou, Kaige Mao, Wenjie Pei, Hongpeng Wang, YaoWei Wang, Zhenyu He
To be specific, RHMNet first uses only the memory at the high-reliability level to locate the region with high reliability belonging to the target, which is highly similar to the initial target scribble.
1 code implementation • CVPR 2023 • Li Zhou, Zikun Zhou, Kaige Mao, Zhenyu He
Such a separated framework overlooks the link between visual grounding and tracking, namely that the natural language descriptions provide global semantic cues for localizing the target in both steps.
Ranked #3 on Visual Tracking on TNL2K
1 code implementation • CVPR 2022 • Zikun Zhou, Jianqiu Chen, Wenjie Pei, Kaige Mao, Hongpeng Wang, Zhenyu He
While it can exploit temporal context such as historical appearances and locations of the target, a potential limitation of such a strategy is that the local tracker tends to misidentify a nearby distractor as the target instead of activating the re-detector when the real target is out of view.
1 code implementation • ICCV 2021 • Zikun Zhou, Wenjie Pei, Xin Li, Hongpeng Wang, Feng Zheng, Zhenyu He
A potential limitation of such trackers is that not all patches are equally informative for tracking.
1 code implementation • 15 Apr 2021 • Kai Yang, Zhenyu He, Wenjie Pei, Zikun Zhou, Xin Li, Di Yuan, Haijun Zhang
By tracking a target as a pair of corners, we avoid the need to design the anchor boxes.
1 code implementation • 3 Aug 2020 • Qiao Liu, Xin Li, Zhenyu He, Chenglong Li, Jun Li, Zikun Zhou, Di Yuan, Jing Li, Kai Yang, Nana Fan, Feng Zheng
We evaluate and analyze more than 30 trackers on LSOTB-TIR to provide a series of baselines, and the results show that deep trackers achieve promising performance.
Thermal Infrared Object Tracking • Vocal Bursts Intensity Prediction