Search Results for author: Xinglin Hou

Found 6 papers, 0 papers with code

Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning

no code implementations • 1 Jan 2024 • Kaibin Tian, Yanhua Cheng, Yi Liu, Xinglin Hou, Quan Chen, Han Li

To address this issue, we adopt multi-granularity visual feature learning, ensuring the model's comprehensiveness in capturing visual content features spanning from abstract to detailed levels during the training phase.

Representation Learning, Retrieval, +3

Visual Captioning at Will: Describing Images and Videos Guided by a Few Stylized Sentences

no code implementations • 31 Jul 2023 • Dingyi Yang, Hongyu Chen, Xinglin Hou, Tiezheng Ge, Yuning Jiang, Qin Jin

To address these limitations, we explore the problem of Few-Shot Stylized Visual Captioning, which aims to generate captions in any desired style, using only a few examples as guidance during inference, without requiring further training.

Image Captioning, Language Modelling

Edit As You Wish: Video Description Editing with Multi-grained Commands

no code implementations • 15 May 2023 • Linli Yao, Yuanmeng Zhang, Ziheng Wang, Xinglin Hou, Tiezheng Ge, Yuning Jiang, Qin Jin

In this paper, we propose a novel Video Description Editing (VDEdit) task to automatically revise an existing video description guided by flexible user requests.

Attribute, Position, +3

Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information

no code implementations • 7 May 2022 • Zhipeng Zhang, Xinglin Hou, Kai Niu, Zhongzhen Huang, Tiezheng Ge, Yuning Jiang, Qi Wu, Peng Wang

Therefore, we present a dataset, E-MMAD (e-commercial multimodal multi-structured advertisement copywriting), which requires and supports much more detailed information in text generation.

Text Generation, Video Captioning

Dual-Level Decoupled Transformer for Video Captioning

no code implementations • 6 May 2022 • Yiqi Gao, Xinglin Hou, Wei Suo, Mengyang Sun, Tiezheng Ge, Yuning Jiang, Peng Wang

As for the latter, "couple" means treating the generation of visual semantic and syntax-related words equally.

Descriptive, Sentence, +1

CapOnImage: Context-driven Dense-Captioning on Image

no code implementations • 27 Apr 2022 • Yiqi Gao, Xinglin Hou, Yuanmeng Zhang, Tiezheng Ge, Yuning Jiang, Peng Wang

Existing image captioning systems are dedicated to generating narrative captions for images, which are spatially detached from the image in presentation.

Dense Captioning, Image Captioning
