no code implementations • 8 Apr 2024 • Xiaoyi Bao, Siyang Sun, Shuailei Ma, Kecheng Zheng, Yuxin Guo, Guosheng Zhao, Yun Zheng, Xingang Wang
We believe that the act of reasoning segmentation should mirror the cognitive stages of human visual search, where each step is a progressive refinement of thought toward the final object.
1 code implementation • NeurIPS 2023 • Yuxin Guo, Shijie Ma, Hu Su, Zhiqing Wang, Yuhao Zhao, Wei Zou, Siyang Sun, Yun Zheng
Audio-Visual Source Localization (AVSL) aims to locate sounding objects within video frames given the paired audio clips.
no code implementations • 18 Dec 2023 • Shuailei Ma, Chen-Wei Xie, Ying WEI, Siyang Sun, Jiaqi Fan, Xiaoyi Bao, Yuxin Guo, Yun Zheng
In this paper, we conduct a direct analysis of the multi-modal prompts by asking the following questions: $(i)$ How do the learned multi-modal prompts improve the recognition performance?
no code implementations • 11 Dec 2023 • Xiaoyi Bao, Jie Qin, Siyang Sun, Yun Zheng, Xingang Wang
To improve the semantic consistency of foreground instances, we propose an unlabeled branch as an efficient data utilization method, which teaches the model how to extract intrinsic features robust to intra-class differences.
no code implementations • CVPR 2023 • Chen-Wei Xie, Siyang Sun, Xiong Xiong, Yun Zheng, Deli Zhao, Jingren Zhou
This process can be considered as an open-book exam: with the reference set as a cheat sheet, the proposed method doesn't need to memorize all visual concepts in the training data.
no code implementations • 9 Feb 2021 • Yanhao Zhang, Qiang Wang, Pan Pan, Yun Zheng, Cheng Da, Siyang Sun, Yinghui Xu
Nowadays, live-stream and short video shopping in E-commerce have grown exponentially.
no code implementations • 24 Sep 2019 • Hongxuan Ma, Wei Zou, Zheng Zhu, Siyang Sun, Zhaobing Kang
In the field of navigation and visual servo, it is common to calculate relative pose by feature points on markers, so keeping markers in camera's view is an important problem.
no code implementations • 19 Apr 2019 • Siyang Sun, Yingjie Yin, Xingang Wang, De Xu, Yuan Zhao, Haifeng Shen
To address this problem, we propose a multiple receptive field and small-object-focusing weakly-supervised segmentation network (MRFSWSnet) to achieve fast object detection.
no code implementations • 27 Feb 2019 • Yiming Hu, Siyang Sun, Jianquan Li, Jiagang Zhu, Xingang Wang, Qingyi Gu
Particularly, we introduce an additional loss to encode the differences in the feature and semantic distributions within feature maps between the baseline model and the pruned one.
no code implementations • 29 May 2018 • Yiming Hu, Siyang Sun, Jianquan Li, Xingang Wang, Qingyi Gu
In order to accelerate the selection process, the proposed method formulates it as a search problem, which can be solved efficiently by genetic algorithm.