Search Results for author: Fuhai Chen

Found 11 papers, 2 papers with code

Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval

no code implementations • 17 Oct 2022 • Xuri Ge, Fuhai Chen, Songpei Xu, Fuxiang Tao, Joemon M. Jose

To correlate the context of objects with the textual context, we further refine the visual semantic representation via the cross-level object-sentence and word-image based interactive attention.

Object Retrieval +1
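The paper's exact cross-level object-sentence and word-image attention is not spelled out in this snippet; as a rough illustration of the underlying idea, here is a minimal dot-product cross-attention in NumPy, where fragments of one modality (e.g. objects) are refined by attending over fragments of the other (e.g. words). Function and variable names are my own, not the paper's.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Refine one modality's fragments by attending over the other's.

    queries:     (n_q, d) array, e.g. object/region features
    keys_values: (n_k, d) array, e.g. word features
    Returns (n_q, d): a text-aware context vector per query.
    """
    d = queries.shape[1]
    scores = queries @ keys_values.T / np.sqrt(d)   # (n_q, n_k)
    weights = softmax(scores, axis=-1)              # each row sums to 1
    return weights @ keys_values
```

Stacking two such calls (object-sentence and word-image) would give the cross-level interaction pattern the abstract alludes to.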

Global2Local: A Joint-Hierarchical Attention for Video Captioning

no code implementations • 13 Mar 2022 • Chengpeng Dai, Fuhai Chen, Xiaoshuai Sun, Rongrong Ji, Qixiang Ye, Yongjian Wu

Recently, automatic video captioning has attracted increasing attention; the core challenge lies in capturing the key semantic items, such as objects and actions, as well as their spatial-temporal correlations, from redundant frames and semantic content.

Video Captioning

Weakly-Supervised Dense Action Anticipation

1 code implementation • 15 Nov 2021 • Haotong Zhang, Fuhai Chen, Angela Yao

We present a (semi-) weakly supervised method using only a small number of fully-labelled sequences and predominantly sequences in which only the (one) upcoming action is labelled.

Action Anticipation

Structured Multi-modal Feature Embedding and Alignment for Image-Sentence Retrieval

no code implementations • 5 Aug 2021 • Xuri Ge, Fuhai Chen, Joemon M. Jose, Zhilong Ji, Zhongqin Wu, Xiao Liu

In this work, we propose to address the above issue from two aspects: (i) constructing intrinsic structure (along with relations) among the fragments of respective modalities, e.g., "dog $\to$ play $\to$ ball" in semantic structure for an image, and (ii) seeking explicit inter-modal structural and semantic correspondence between the visual and textual modalities.

Retrieval · Semantic correspondence +1
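The paper learns this correspondence end-to-end; purely to illustrate what "inter-modal structural correspondence" over fragments like "dog $\to$ play $\to$ ball" means, here is a toy score that compares the (subject, relation, object) triples extracted from each modality. This is my own simplification, not the paper's method.

```python
def structure_alignment(visual_triples, textual_triples):
    """Toy inter-modal correspondence: Jaccard overlap between the
    (subject, relation, object) triples of the two modalities."""
    v, t = set(visual_triples), set(textual_triples)
    return len(v & t) / max(len(v | t), 1)

# hypothetical structures parsed from an image and its caption
image_graph   = [("dog", "play", "ball")]
caption_graph = [("dog", "play", "ball"), ("ball", "on", "grass")]
```

A learned model would replace exact triple matching with similarity between embedded fragments, but the alignment objective has the same shape.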

Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network

1 code implementation • 13 Dec 2020 • Jiayi Ji, Yunpeng Luo, Xiaoshuai Sun, Fuhai Chen, Gen Luo, Yongjian Wu, Yue Gao, Rongrong Ji

The latter contains a Global Adaptive Controller that can adaptively fuse the global information into the decoder to guide the caption generation.

Caption Generation · Image Captioning
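A "Global Adaptive Controller" that adaptively fuses global information into the decoder typically amounts to a learned gate between the decoder state and a global feature; a minimal NumPy sketch of such gated fusion follows. The gating parameterization is an assumption of mine, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_global_fuse(hidden, global_feat, w_gate):
    """A scalar gate in (0, 1), computed from the decoder state and the
    global feature, controls how much global context is mixed in.

    hidden, global_feat: (d,) arrays; w_gate: (2d,) learned weights.
    """
    gate = sigmoid(np.concatenate([hidden, global_feat]) @ w_gate)
    return gate * global_feat + (1.0 - gate) * hidden
```

With the gate near 0 the decoder relies on its local state; near 1 it leans on the global representation, which is the "adaptive" part.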

Variational Structured Semantic Inference for Diverse Image Captioning

no code implementations • NeurIPS 2019 • Fuhai Chen, Rongrong Ji, Jiayi Ji, Xiaoshuai Sun, Baochang Zhang, Xuri Ge, Yongjian Wu, Feiyue Huang, Yan Wang

To model these two inherent diversities in image captioning, we propose a Variational Structured Semantic Inferring model (termed VSSI-cap) executed in a novel structured encoder-inferer-decoder schema.

Image Captioning
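Variational inference models such as VSSI-cap draw diverse latent codes via the standard reparameterization trick; a minimal NumPy sketch is below. This shows only the generic sampling step, not the paper's structured encoder-inferer-decoder.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I).

    Different draws of eps give diverse latent codes (hence diverse
    captions), while z stays differentiable w.r.t. mu and log_var.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps
```

At decoding time, sampling several `z` from the same posterior is what yields multiple distinct captions for one image.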

Semantic-aware Image Deblurring

no code implementations • 9 Oct 2019 • Fuhai Chen, Rongrong Ji, Chengpeng Dai, Xiaoshuai Sun, Chia-Wen Lin, Jiayi Ji, Baochang Zhang, Feiyue Huang, Liujuan Cao

Specifically, we propose a novel Structured-Spatial Semantic Embedding model for image deblurring (termed S3E-Deblur), which introduces a novel Structured-Spatial Semantic tree model (S3-tree) to bridge two basic tasks in computer vision: image deblurring (ImD) and image captioning (ImC).

Deblurring · Image Captioning +1

Scene-based Factored Attention for Image Captioning

no code implementations • 7 Aug 2019 • Chen Shen, Rongrong Ji, Fuhai Chen, Xiaoshuai Sun, Xiangming Li

Specifically, the proposed module first embeds the scene concepts into factored weights explicitly and then attends to the visual information extracted from the input image.

Caption Generation · Image Captioning +1
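One simple reading of "embedding scene concepts into factored weights that attend the visual information" is scene-conditioned attention: per-dimension weights derived from scene concepts rescale region features before scoring. The following NumPy sketch is my own interpretation, not the paper's module.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def scene_factored_attention(regions, scene_weights):
    """Scene-conditioned attention over image regions.

    regions:       (n, d) region features
    scene_weights: (d,) factored weights derived from scene concepts
    The same regions get attended differently under different scenes.
    """
    scores = (regions * scene_weights).sum(axis=-1)  # (n,)
    alpha = softmax(scores)                          # sums to 1
    return alpha @ regions                           # (d,) pooled feature
```

When the scene weights are uninformative (all zeros), attention degenerates to uniform average pooling, which is a handy sanity check.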

GroupCap: Group-Based Image Captioning With Structured Relevance and Diversity Constraints

no code implementations • CVPR 2018 • Fuhai Chen, Rongrong Ji, Xiaoshuai Sun, Yongjian Wu, Jinsong Su

In offline optimization, we adopt an end-to-end formulation, which jointly trains the visual tree parser, the structured relevance and diversity constraints, as well as the LSTM based captioning model.

Image Captioning
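Jointly training a captioning model with relevance and diversity constraints usually means optimizing a single weighted objective; a schematic combination is sketched below. The term names and weights are placeholders of mine, not GroupCap's actual formulation.

```python
def group_caption_loss(nll, relevance_penalty, diversity_score,
                       lam_r=0.5, lam_d=0.5):
    """Schematic joint objective for group-based captioning.

    nll:               caption negative log-likelihood (fit the data)
    relevance_penalty: grows when a caption drifts from its group's
                       shared semantics (minimized)
    diversity_score:   grows when captions within a group differ
                       (rewarded, hence subtracted)
    """
    return nll + lam_r * relevance_penalty - lam_d * diversity_score
```

End-to-end training then backpropagates this single scalar through the visual tree parser, the constraint terms, and the LSTM decoder together.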
