1 code implementation • 5 Mar 2023 • Kang Chen, Xiangqian Wu
The ideal form of Visual Question Answering requires understanding, grounding and reasoning in the joint space of vision and language and serves as a proxy for the AI task of scene understanding.
no code implementations • ICCV 2023 • Ding Ma, Xiangqian Wu
The main challenge of Tracking by Natural Language Specification (TNL) is to predict the movement of the target object given two heterogeneous sources of information: the static description of the main characteristics of a video contained in the textual query, i.e., the long-term context; and an image patch containing the object and its surroundings cropped from the current frame, i.e., the search area.
no code implementations • CVPR 2021 • Ding Ma, Xiangqian Wu
Regression tracking has gained increasing attention because it is easy to implement, yet existing regression trackers rarely consider the relationships between the object parts and the complete object.
6 code implementations • CVPR 2019 • Ting Zhao, Xiangqian Wu
To solve this problem, we propose a Pyramid Feature Attention network to focus on effective high-level context features and low-level spatial structural features.
Ranked #1 on Saliency Detection on PASCAL-S
no code implementations • 26 Feb 2019 • Ding Ma, Xiangqian Wu
The critical challenge in the tracking-by-detection framework is how to avoid the drift problem during online learning, where robust features for a variety of appearance changes are difficult to learn and a reasonable intersection over union (IoU) threshold that defines the true/false positives is hard to set.
no code implementations • 18 Aug 2016 • Youbao Tang, Xiangqian Wu
This paper proposes a novel saliency detection method by combining region-level saliency estimation and pixel-level saliency prediction with CNNs (denoted as CRPSD).
no code implementations • 18 Aug 2016 • Youbao Tang, Xiangqian Wu, Wei Bu
This paper proposes a novel saliency detection method by developing a deeply-supervised recurrent convolutional neural network (DSRCNN), which performs a full image-to-image saliency prediction.