Search Results for author: Xiangxi Shi

Found 5 papers, 1 paper with code

Learning Meta-class Memory for Few-Shot Semantic Segmentation

1 code implementation · ICCV 2021 · Zhonghua Wu, Xiangxi Shi, Guosheng Lin, Jianfei Cai

To explicitly learn meta-class representations for the few-shot segmentation task, we propose a novel Meta-class Memory based few-shot segmentation method (MM-Net), which introduces a set of learnable memory embeddings to memorize meta-class information during base-class training and transfer it to novel classes during the inference stage.

Few-Shot Semantic Segmentation · Segmentation +1
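The learnable memory embeddings described above can be sketched as an attention-style read over a small set of memory slots. This is a minimal numpy illustration under assumed shapes, not the MM-Net implementation; the function and variable names are hypothetical.

```python
import numpy as np

def memory_read(features, memory):
    # features: (N, D) per-pixel or per-region query features
    # memory:   (K, D) learnable meta-class memory embeddings (hypothetical)
    # Each feature attends over the K memory slots with a softmax over
    # dot-product similarities, then receives a weighted sum of embeddings.
    scores = features @ memory.T                             # (N, K) similarity logits
    scores = scores - scores.max(axis=1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)   # softmax over slots
    return weights @ memory                                  # (N, D) memory-enhanced features
```

During base-class training the memory matrix would be updated by gradient descent; at inference on novel classes the slots are frozen and only read from.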

Remember What You have drawn: Semantic Image Manipulation with Memory

no code implementations · 27 Jul 2021 · Xiangxi Shi, Zhonghua Wu, Guosheng Lin, Jianfei Cai, Shafiq Joty

In this paper, we propose a memory-based Image Manipulation Network (MIM-Net), in which a set of memories learned from images is introduced to synthesize texture information under the guidance of the textual description.

Image Manipulation

Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning

no code implementations · ECCV 2020 · Xiangxi Shi, Xu Yang, Jiuxiang Gu, Shafiq Joty, Jianfei Cai

In this paper, we propose a novel visual encoder to explicitly distinguish viewpoint changes from semantic changes in the change captioning task.

Reinforcement Learning (RL)

Watch It Twice: Video Captioning with a Refocused Video Encoder

no code implementations · 21 Jul 2019 · Xiangxi Shi, Jianfei Cai, Shafiq Joty, Jiuxiang Gu

With the rapid growth of video data and the increasing demands of applications such as intelligent video search and assistance for visually impaired people, the video captioning task has recently received a lot of attention in the computer vision and natural language processing fields.

Video Captioning

Video Captioning with Boundary-aware Hierarchical Language Decoding and Joint Video Prediction

no code implementations · 8 Jul 2018 · Xiangxi Shi, Jianfei Cai, Jiuxiang Gu, Shafiq Joty

In this paper, we propose a boundary-aware hierarchical language decoder for video captioning, which consists of a high-level GRU-based language decoder working as a global (caption-level) language model, and a low-level GRU-based language decoder working as a local (phrase-level) language model.

Language Modelling · Sentence +3
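The two-level decoder structure described above can be sketched as a caption-level GRU that emits one context state per phrase, with a phrase-level GRU unrolled over words inside each phrase. This is a minimal numpy sketch of the hierarchy only, with randomly initialized weights and no vocabulary or training; every name here is hypothetical, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, p):
    # One standard GRU step; p holds input weights W*, recurrent weights U*, biases b*.
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])          # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])          # reset gate
    n = np.tanh(p["Wn"] @ x + p["Un"] @ (r * h) + p["bn"])    # candidate state
    return (1 - z) * h + z * n

def init_gru(din, dh):
    # Small random parameters, stand-ins for learned weights.
    shapes = [("Wz", (dh, din)), ("Uz", (dh, dh)), ("bz", dh),
              ("Wr", (dh, din)), ("Ur", (dh, dh)), ("br", dh),
              ("Wn", (dh, din)), ("Un", (dh, dh)), ("bn", dh)]
    return {k: 0.1 * rng.standard_normal(s) for k, s in shapes}

def hierarchical_decode(video_feat, n_phrases, words_per_phrase, dh=8):
    # High-level (caption-level) GRU: one step per phrase boundary.
    # Low-level (phrase-level) GRU: one step per word, conditioned on the
    # high-level state for its phrase. Returns all word-level states.
    high = init_gru(len(video_feat), dh)
    low = init_gru(dh, dh)
    h_hi = np.zeros(dh)
    word_states = []
    for _ in range(n_phrases):
        h_hi = gru_cell(video_feat, h_hi, high)   # caption-level update at boundary
        h_lo = np.zeros(dh)                       # phrase-level state resets per phrase
        for _ in range(words_per_phrase):
            h_lo = gru_cell(h_hi, h_lo, low)      # word-level state within the phrase
            word_states.append(h_lo.copy())
    return np.stack(word_states)                  # (n_phrases * words_per_phrase, dh)
```

In a full captioner each word-level state would feed a softmax over the vocabulary, and the boundary detector would decide when to advance the high-level GRU; here the phrase length is fixed for brevity.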
