Search Results for author: Zerun Feng

Found 6 papers, 1 paper with code

ProTA: Probabilistic Token Aggregation for Text-Video Retrieval

no code implementations • 18 Apr 2024 • Han Fang, Xianghao Zang, Chao Ban, Zerun Feng, Lanxiang Zhou, Zhongjiang He, Yongxiang Li, Hao Sun

Text-video retrieval aims to find the most relevant cross-modal samples for a given query.

Retrieval, Video Retrieval

Integrating Listwise Ranking into Pairwise-based Image-Text Retrieval

1 code implementation • 26 May 2023 • Zheng Li, Caili Guo, Xin Wang, Zerun Feng, Yanjun Wang

Given a query caption, the goal is to rank candidate images in descending order of relevance.

Retrieval, Text Retrieval

Selectively Hard Negative Mining for Alleviating Gradient Vanishing in Image-Text Matching

no code implementations • 1 Mar 2023 • Zheng Li, Caili Guo, Xin Wang, Zerun Feng, Zhongtian Du

To alleviate the gradient vanishing problem, we propose a Selectively Hard Negative Mining (SelHN) strategy, which chooses whether to mine hard negative samples according to the gradient vanishing condition.

Image-text matching, Text Matching

Image-Text Retrieval with Binary and Continuous Label Supervision

no code implementations • 20 Oct 2022 • Zheng Li, Caili Guo, Zerun Feng, Jenq-Neng Hwang, Ying Jin, Yufeng Zhang

Such a binary indicator covers only a limited subset of image-text semantic relations, which is insufficient to represent relevance degrees between images and texts described by continuous labels such as image captions.

Image Captioning, Retrieval, +2

Unified Loss of Pair Similarity Optimization for Vision-Language Retrieval

no code implementations • 28 Sep 2022 • Zheng Li, Caili Guo, Xin Wang, Zerun Feng, Jenq-Neng Hwang, Zhongtian Du

More specifically, Triplet loss with Hard Negative mining (Triplet-HN), which is widely used in existing retrieval models to improve discriminative ability, tends to fall into local minima during training.

Contrastive Learning, Retrieval, +2

Exploiting Visual Semantic Reasoning for Video-Text Retrieval

no code implementations • 16 Jun 2020 • Zerun Feng, Zhimin Zeng, Caili Guo, Zheng Li

Finally, the region features are aggregated to form frame-level features for further encoding to measure video-text similarity.

Retrieval, Text Retrieval, +2
