no code implementations • 25 Apr 2024 • Han Liu, Yinwei Wei, Xuemeng Song, Weili Guan, Yuan-Fang Li, Liqiang Nie
Multimodal recommendation aims to recommend user-preferred candidates based on a user's historically interacted items and the associated multimodal information.
1 code implementation • 4 Apr 2024 • Tiantian Geng, Teng Wang, Yanfu Zhang, Jinming Duan, Weili Guan, Feng Zheng
Video localization tasks aim to temporally locate specific instances in videos, including temporal action localization (TAL), sound event detection (SED) and audio-visual event localization (AVEL).
1 code implementation • 9 Jan 2024 • Xue Dong, Xuemeng Song, Tongliang Liu, Weili Guan
Multi-interest learning for sequential recommendation aims to predict the next item according to a user's multi-faceted interests, given the user's historical interactions.
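A minimal sketch of the multi-interest idea, assuming a set of learnable interest queries that attend over the interaction history; the class and all parameters below are hypothetical stand-ins, not the paper's model:

```python
# Hypothetical multi-interest scorer: K learnable interest queries attend over
# the user's history, and the next-item score takes the best-matching interest.
import torch
import torch.nn as nn

class MultiInterest(nn.Module):
    def __init__(self, num_items, dim=64, num_interests=4):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, dim)
        self.queries = nn.Parameter(torch.randn(num_interests, dim))

    def forward(self, history):                      # history: (batch, seq_len)
        h = self.item_emb(history)                   # (batch, seq, dim)
        attn = torch.softmax(
            self.queries @ h.transpose(1, 2) / h.size(-1) ** 0.5, dim=-1
        )                                            # (batch, K, seq)
        interests = attn @ h                         # (batch, K, dim)
        scores = interests @ self.item_emb.weight.T  # (batch, K, num_items)
        return scores.max(dim=1).values              # best interest per item

model = MultiInterest(num_items=1000)
logits = model(torch.randint(0, 1000, (2, 10)))      # (2, 1000)
```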
no code implementations • 11 Oct 2023 • Haoyu Zhang, Meng Liu, YaoWei Wang, Da Cao, Weili Guan, Liqiang Nie
In response to this gap, we present an iterative tracking and reasoning strategy that amalgamates a textual encoder, a visual encoder, and a generator.
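Only the high-level wiring is stated, so the following is a rough skeleton of such an iterative loop under our own assumptions; the encoder and generator choices are placeholders, not the paper's components:

```python
# Skeleton of an iterative tracking-and-reasoning loop (a guess at the wiring).
import torch
import torch.nn as nn

text_enc = nn.LSTM(input_size=300, hidden_size=256, batch_first=True)
vis_enc = nn.Linear(2048, 256)        # projects per-frame CNN features
generator = nn.GRUCell(512, 256)      # fuses text + visual state each step

def iterate(text_feats, frame_feats, steps=3):
    t, _ = text_enc(text_feats)       # (batch, words, 256)
    state = t[:, -1]                  # start from the text summary
    for _ in range(steps):            # iterative tracking / reasoning
        v = vis_enc(frame_feats).mean(dim=1)          # (batch, 256)
        state = generator(torch.cat([state, v], -1), state)
    return state                      # refined joint representation

out = iterate(torch.randn(2, 8, 300), torch.randn(2, 16, 2048))
```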
no code implementations • ICCV 2023 • Baoshuo Kan, Teng Wang, Wenpeng Lu, XianTong Zhen, Weili Guan, Feng Zheng
Pre-trained vision-language models, e.g., CLIP, working with manually designed prompts have demonstrated great capacity for transfer learning.
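For reference, a manually prompted zero-shot classifier over the public CLIP checkpoint looks roughly like this; the prompt wording, labels, and image path are arbitrary examples:

```python
# Zero-shot classification with a hand-crafted prompt template and CLIP.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

labels = ["cat", "dog", "car"]
prompts = [f"a photo of a {c}" for c in labels]   # the manual prompt design
image = Image.open("example.jpg")                 # any RGB image

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=-1)
print(dict(zip(labels, probs[0].tolist())))
```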
1 code implementation • 2 Aug 2023 • Guojin Zhong, Jin Yuan, Pan Wang, Kailun Yang, Weili Guan, Zhiyong Li
The recently emerging task of markup-to-image generation poses greater challenges than natural image generation, due to its low tolerance for errors as well as the complex sequence and context correlations between the markup and the rendered image.
1 code implementation • ICCV 2023 • Dong Lu, Zhiqiang Wang, Teng Wang, Weili Guan, Hongchang Gao, Feng Zheng
Vision-language pre-training (VLP) models have shown vulnerability to adversarial examples in multimodal tasks.
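As a point of reference only, the classic single-step FGSM perturbation below shows what an adversarial example against a matching score looks like; the paper's attack is multimodal and transfer-oriented, which this sketch does not capture:

```python
# Classic FGSM step that lowers an image-text matching score (illustrative).
import torch

def fgsm_attack(score_fn, image, eps=2 / 255):
    """score_fn(image) -> scalar image-text matching score to push down."""
    image = image.clone().requires_grad_(True)
    score_fn(image).backward()              # gradient of the matching score
    adv = image - eps * image.grad.sign()   # signed-gradient step against the match
    return adv.detach()
```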
no code implementations • 10 Apr 2023 • Zan Gao, Shenxun Wei, Weili Guan, Lei Zhu, Meng Wang, Shenyong Chen
Moreover, human semantic information and pedestrian identity information are not fully explored.
no code implementations • 18 Jul 2022 • Zan Gao, Hongwei Wei, Weili Guan, Jie Nie, Meng Wang, Shenyong Chen
In addition, a visual clothes shielding module (VCS) is also designed to extract a more robust feature representation for the cloth-changing task by covering the clothing regions and focusing the model on the visual semantic information unrelated to the clothes.
Cloth-Changing Person Re-Identification • Semantic Segmentation
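A minimal reading of the clothes-shielding idea, assuming a human-parsing map is available; the label ids below are hypothetical, and the paper's VCS module is more than this hard mask:

```python
# Zero out pixels whose parsing label is a clothing class, so later layers
# attend to clothing-independent cues (a simplification of the VCS idea).
import torch

CLOTHES = {5, 6, 7}  # hypothetical label ids for upper/lower clothes, coat

def shield_clothes(image, parsing):
    # image: (3, H, W) float; parsing: (H, W) long human-parsing labels
    mask = torch.ones_like(parsing, dtype=image.dtype)
    for c in CLOTHES:
        mask = mask * (parsing != c)
    return image * mask.unsqueeze(0)       # broadcast over RGB channels

shielded = shield_clothes(torch.rand(3, 128, 64), torch.randint(0, 20, (128, 64)))
```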
1 code implementation • 10 Jan 2022 • Ansong Li, Zhiyong Cheng, Fan Liu, Zan Gao, Weili Guan, Yuxin Peng
The session embedding is then generated by aggregating the item embeddings with attention weights derived from each item's factors.
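The aggregation step itself is straightforward; here is a minimal sketch with the factor-level machinery omitted, so the scoring layer is a stand-in:

```python
# Attention-weighted pooling of item embeddings into a session embedding.
import torch
import torch.nn as nn

class SessionAggregator(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.score = nn.Linear(dim, 1)     # stand-in for factor-based scoring

    def forward(self, item_embs):                         # (batch, len, dim)
        w = torch.softmax(self.score(item_embs), dim=1)   # attention weights
        return (w * item_embs).sum(dim=1)                 # (batch, dim)

agg = SessionAggregator()
session = agg(torch.randn(2, 5, 64))
```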
no code implementations • 25 Sep 2021 • Zan Gao, Yuxiang Shao, Weili Guan, Meng Liu, Zhiyong Cheng, ShengYong Chen
Thus, we tackle this problem from the perspective of exploiting the relationships between patch features to capture long-range associations among multi-view images.
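One generic way to realize this is to concatenate patch features from several views and run self-attention over the combined sequence; this is an illustration, not the paper's exact architecture:

```python
# Self-attention across patches pooled from multiple views, so every patch
# can attend to every other patch regardless of which view it came from.
import torch
import torch.nn as nn

attn = nn.MultiheadAttention(embed_dim=256, num_heads=4, batch_first=True)

views = [torch.randn(2, 49, 256) for _ in range(3)]  # 3 views x 49 patches
patches = torch.cat(views, dim=1)                    # (2, 147, 256)
out, weights = attn(patches, patches, patches)       # long-range associations
```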
no code implementations • 10 Aug 2021 • Zan Gao, Hongwei Wei, Weili Guan, Weizhi Nie, Meng Liu, Meng Wang
To solve these issues, this work proposes a novel multigranular visual-semantic embedding algorithm (MVSE) for cloth-changing person ReID, which embeds visual semantic information and human attributes into the network. Generalized features of human appearance can thus be well learned, effectively addressing the problem of clothing changes.
no code implementations • 10 Aug 2021 • Zan Gao, Chao Sun, Zhiyong Cheng, Weili Guan, AnAn Liu, Meng Wang
In this work, a novel end-to-end two-stream boundary-aware network (abbreviated as TBNet) is proposed for generic image manipulation localization, in which the RGB stream, the frequency stream, and the boundary artifact location are explored in a unified framework.
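A toy fusion of an RGB stream with a frequency-domain stream, to make the two-stream idea concrete; the layers and the FFT-based frequency view are our simplifications, and TBNet's actual streams and boundary branch are richer:

```python
# Two parallel convolutional branches over the RGB image and its frequency-
# domain view, fused into a per-pixel manipulation-localization map.
import torch
import torch.nn as nn

rgb_branch = nn.Conv2d(3, 16, 3, padding=1)
freq_branch = nn.Conv2d(3, 16, 3, padding=1)
head = nn.Conv2d(32, 1, 1)                 # per-pixel manipulation logits

def forward(image):                        # image: (batch, 3, H, W)
    freq = torch.fft.fft2(image).abs().log1p()   # simple frequency-domain view
    fused = torch.cat([rgb_branch(image), freq_branch(freq)], dim=1)
    return head(torch.relu(fused))         # (batch, 1, H, W) localization map

mask = forward(torch.randn(2, 3, 64, 64))
```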
no code implementations • IJCNLP 2019 • Linmei Hu, Luhao Zhang, Chuan Shi, Liqiang Nie, Weili Guan, Cheng Yang
Distantly-supervised relation extraction has proven effective for finding relational facts in text.