Search Results for author: Fuhai Chen

Found 11 papers, 2 papers with code

Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval

no code implementations • 17 Oct 2022 • Xuri Ge, Fuhai Chen, Songpei Xu, Fuxiang Tao, Joemon M. Jose

To correlate the context of objects with the textual context, we further refine the visual semantic representation via the cross-level object-sentence and word-image based interactive attention.

Object Retrieval +1
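The paper's exact cross-level object-sentence and word-image attention is not spelled out in this snippet; as a rough illustration of the underlying idea, here is a minimal dot-product cross-attention in NumPy, where fragments of one modality (e.g. objects) are refined by attending over fragments of the other (e.g. words). Function and variable names are my own, not the paper's.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Refine one modality's fragments by attending over the other's.

    queries:     (n_q, d) array, e.g. object/region features
    keys_values: (n_k, d) array, e.g. word features
    Returns (n_q, d): a text-aware context vector per query.
    """
    d = queries.shape[1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (n_q, n_k)
    weights = softmax(scores, axis=-1)              # each row sums to 1
    return weights @ keys_values
```

Stacking two such calls (object-sentence and word-image) would give the cross-level interaction pattern the abstract alludes to.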

Global2Local: A Joint-Hierarchical Attention for Video Captioning

no code implementations • 13 Mar 2022 • Chengpeng Dai, Fuhai Chen, Xiaoshuai Sun, Rongrong Ji, Qixiang Ye, Yongjian Wu

Recently, automatic video captioning has attracted increasing attention; the core challenge lies in capturing the key semantic items, such as objects and actions, as well as their spatial-temporal correlations, from redundant frames and semantic content.

Video Captioning

Weakly-Supervised Dense Action Anticipation

1 code implementation • 15 Nov 2021 • Haotong Zhang, Fuhai Chen, Angela Yao

We present a (semi-) weakly supervised method using only a small number of fully-labelled sequences and predominantly sequences in which only the (one) upcoming action is labelled.

Action Anticipation

Structured Multi-modal Feature Embedding and Alignment for Image-Sentence Retrieval

no code implementations • 5 Aug 2021 • Xuri Ge, Fuhai Chen, Joemon M. Jose, Zhilong Ji, Zhongqin Wu, Xiao Liu

In this work, we propose to address the above issue from two aspects: (i) constructing intrinsic structure (along with relations) among the fragments of respective modalities, e.g., "dog $\to$ play $\to$ ball" in semantic structure for an image, and (ii) seeking explicit inter-modal structural and semantic correspondence between the visual and textual modalities.

Retrieval · Semantic correspondence +1
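The paper learns this correspondence end-to-end; purely to illustrate what "inter-modal structural correspondence" over fragments like "dog $\to$ play $\to$ ball" means, here is a toy score that compares the (subject, relation, object) triples extracted from each modality. This is my own simplification, not the paper's method.

```python
def structure_alignment(visual_triples, textual_triples):
    """Toy inter-modal correspondence: Jaccard overlap between the
    (subject, relation, object) triples of the two modalities."""
    v, t = set(visual_triples), set(textual_triples)
    return len(v & t) / max(len(v | t), 1)

# hypothetical structures parsed from an image and its caption
image_graph   = [("dog", "play", "ball")]
caption_graph = [("dog", "play", "ball"), ("ball", "on", "grass")]
```

A learned model would replace exact triple matching with similarity between embedded fragments, but the alignment objective has the same shape.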

Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network

1 code implementation • 13 Dec 2020 • Jiayi Ji, Yunpeng Luo, Xiaoshuai Sun, Fuhai Chen, Gen Luo, Yongjian Wu, Yue Gao, Rongrong Ji

The latter contains a Global Adaptive Controller that can adaptively fuse the global information into the decoder to guide the caption generation.

Caption Generation · Image Captioning
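A "Global Adaptive Controller" that adaptively fuses global information into the decoder typically amounts to a learned gate between the decoder state and a global feature; a minimal NumPy sketch of such gated fusion follows. The gating parameterization is an assumption of mine, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_global_fuse(hidden, global_feat, w_gate):
    """A scalar gate in (0, 1), computed from the decoder state and the
    global feature, controls how much global context is mixed in.

    hidden, global_feat: (d,) arrays; w_gate: (2d,) learned weights.
    """
    gate = sigmoid(np.concatenate([hidden, global_feat]) @ w_gate)
    return gate * global_feat + (1.0 - gate) * hidden
```

With the gate near 0 the decoder relies on its local state; near 1 it leans on the global representation, which is the "adaptive" part.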

Variational Structured Semantic Inference for Diverse Image Captioning

no code implementations • NeurIPS 2019 • Fuhai Chen, Rongrong Ji, Jiayi Ji, Xiaoshuai Sun, Baochang Zhang, Xuri Ge, Yongjian Wu, Feiyue Huang, Yan Wang

To model these two inherent diversities in image captioning, we propose a Variational Structured Semantic Inferring model (termed VSSI-cap) executed in a novel structured encoder-inferer-decoder schema.

Image Captioning
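Variational inference models such as VSSI-cap draw diverse latent codes via the standard reparameterization trick; a minimal NumPy sketch is below. This shows only the generic sampling step, not the paper's structured encoder-inferer-decoder.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I).

    Different draws of eps give diverse latent codes (hence diverse
    captions), while z stays differentiable w.r.t. mu and log_var.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```

At decoding time, sampling several `z` from the same posterior is what yields multiple distinct captions for one image.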

Semantic-aware Image Deblurring

no code implementations • 9 Oct 2019 • Fuhai Chen, Rongrong Ji, Chengpeng Dai, Xiaoshuai Sun, Chia-Wen Lin, Jiayi Ji, Baochang Zhang, Feiyue Huang, Liujuan Cao

Specifically, we propose a novel Structured-Spatial Semantic Embedding model for image deblurring (termed S3E-Deblur), which introduces a novel Structured-Spatial Semantic tree model (S3-tree) to bridge two basic tasks in computer vision: image deblurring (ImD) and image captioning (ImC).

Deblurring · Image Captioning +1

Scene-based Factored Attention for Image Captioning

no code implementations • 7 Aug 2019 • Chen Shen, Rongrong Ji, Fuhai Chen, Xiaoshuai Sun, Xiangming Li

Specifically, the proposed module first embeds the scene concepts into factored weights explicitly and then attends to the visual information extracted from the input image.

Caption Generation · Image Captioning +1
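One simple reading of "embedding scene concepts into factored weights that attend the visual information" is scene-conditioned attention: per-dimension weights derived from scene concepts rescale region features before scoring. The following NumPy sketch is my own interpretation, not the paper's module.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def scene_factored_attention(regions, scene_weights):
    """Scene-conditioned attention over image regions.

    regions:       (n, d) region features
    scene_weights: (d,) factored weights derived from scene concepts
    The same regions get attended differently under different scenes.
    """
    scores = (regions * scene_weights).sum(axis=-1)  # (n,)
    alpha = softmax(scores)                          # sums to 1
    return alpha @ regions                           # (d,) pooled feature
```

When the scene weights are uninformative (all zeros), attention degenerates to uniform average pooling, which is a handy sanity check.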

GroupCap: Group-Based Image Captioning With Structured Relevance and Diversity Constraints

no code implementations • CVPR 2018 • Fuhai Chen, Rongrong Ji, Xiaoshuai Sun, Yongjian Wu, Jinsong Su

In offline optimization, we adopt an end-to-end formulation, which jointly trains the visual tree parser, the structured relevance and diversity constraints, as well as the LSTM based captioning model.

Image Captioning
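Jointly training a captioning model with relevance and diversity constraints usually means optimizing a single weighted objective; a schematic combination is sketched below. The term names and weights are placeholders of mine, not GroupCap's actual formulation.

```python
def group_caption_loss(nll, relevance_penalty, diversity_score,
                       lam_r=0.5, lam_d=0.5):
    """Schematic joint objective for group-based captioning.

    nll:               caption negative log-likelihood (fit the data)
    relevance_penalty: grows when a caption drifts from its group's
                       shared semantics (minimized)
    diversity_score:   grows when captions within a group differ
                       (rewarded, hence subtracted)
    """
    return nll + lam_r * relevance_penalty - lam_d * diversity_score
```

End-to-end training then backpropagates this single scalar through the visual tree parser, the constraint terms, and the LSTM decoder together.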
