Search Results for author: Jiayi Ji

Found 18 papers, 16 papers with code

Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation

1 code implementation19 Dec 2023 Sihan Liu, Yiwei Ma, Xiaoqing Zhang, Haowei Wang, Jiayi Ji, Xiaoshuai Sun, Rongrong Ji

Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing, delineating specific regions in aerial images as described by textual queries.

Image Segmentation Segmentation +1

X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation

1 code implementation30 Nov 2023 Yiwei Ma, Yijun Fan, Jiayi Ji, Haowei Wang, Xiaoshuai Sun, Guannan Jiang, Annan Shu, Rongrong Ji

Nevertheless, a substantial domain gap exists between 2D images and 3D assets, primarily attributed to variations in camera-related attributes and the exclusive presence of foreground objects.

3D Generation Text to 3D

Semi-Supervised Panoptic Narrative Grounding

1 code implementation27 Oct 2023 Danni Yang, Jiayi Ji, Xiaoshuai Sun, Haowei Wang, Yinan Li, Yiwei Ma, Rongrong Ji

Remarkably, our SS-PNG-NW+ outperforms fully-supervised models with only 30% and 50% supervision data, exceeding their performance by 0. 8% and 1. 1% respectively.

Data Augmentation Pseudo Label

NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning

1 code implementation17 Oct 2023 Haowei Wang, Jiayi Ji, Tianyu Guo, Yilong Yang, Yiyi Zhou, Xiaoshuai Sun, Rongrong Ji

To address this, we introduce two cascading modules based on the barycenter of the mask, which are Coordinate Guided Aggregation (CGA) and Barycenter Driven Localization (BDL), responsible for segmentation and detection, respectively.

Segmentation Visual Grounding

JM3D & JM3D-LLM: Elevating 3D Understanding with Joint Multi-modal Cues

1 code implementation14 Oct 2023 Jiayi Ji, Haowei Wang, Changli Wu, Yiwei Ma, Xiaoshuai Sun, Rongrong Ji

The rising importance of 3D understanding, pivotal in computer vision, autonomous driving, and robotics, is evident.

Autonomous Driving Representation Learning

3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation

1 code implementation31 Aug 2023 Changli Wu, Yiwei Ma, Qi Chen, Haowei Wang, Gen Luo, Jiayi Ji, Xiaoshuai Sun

In 3D Referring Expression Segmentation (3D-RES), the earlier approach adopts a two-stage paradigm, extracting segmentation proposals and then matching them with referring expressions.

Navigate Referring Expression +3

MMAPS: End-to-End Multi-Grained Multi-Modal Attribute-Aware Product Summarization

1 code implementation22 Aug 2023 Tao Chen, Ze Lin, Hui Li, Jiayi Ji, Yiyi Zhou, Guanbin Li, Rongrong Ji

Furthermore, we model product attributes based on both text and image modalities so that multi-modal product characteristics can be manifested in the generated summaries.

Attribute

X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance

1 code implementation ICCV 2023 Yiwei Ma, Xiaioqing Zhang, Xiaoshuai Sun, Jiayi Ji, Haowei Wang, Guannan Jiang, Weilin Zhuang, Rongrong Ji

Text-driven 3D stylization is a complex and crucial task in the fields of computer vision (CV) and computer graphics (CG), aimed at transforming a bare mesh to fit a target text.

Attribute

Towards Local Visual Modeling for Image Captioning

1 code implementation13 Feb 2023 Yiwei Ma, Jiayi Ji, Xiaoshuai Sun, Yiyi Zhou, Rongrong Ji

In this paper, we study the local visual modeling with grid features for image captioning, which is critical for generating accurate and detailed captions.

Image Captioning Object Recognition

Towards Real-Time Panoptic Narrative Grounding by an End-to-End Grounding Network

1 code implementation9 Jan 2023 Haowei Wang, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Xiaoshuai Sun

Extensive experiments on the PNG benchmark dataset demonstrate the effectiveness and efficiency of our method.

RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words

1 code implementation CVPR 2021 Xuying Zhang, Xiaoshuai Sun, Yunpeng Luo, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Feiyue Huang, Rongrong Ji

Then, we build a BERTbased language model to extract language context and propose Adaptive-Attention (AA) module on top of a transformer decoder to adaptively measure the contribution of visual and language cues before making decisions for word prediction.

Image Captioning Language Modelling +2

Variable selection with missing data in both covariates and outcomes: Imputation and machine learning

1 code implementation6 Apr 2021 Liangyuan Hu, Jung-Yi Joyce Lin, Jiayi Ji

We investigate a general variable selection approach when both the covariates and outcomes can be missing at random and have general missing data patterns.

BIG-bench Machine Learning Imputation +2

Dual-Level Collaborative Transformer for Image Captioning

1 code implementation16 Jan 2021 Yunpeng Luo, Jiayi Ji, Xiaoshuai Sun, Liujuan Cao, Yongjian Wu, Feiyue Huang, Chia-Wen Lin, Rongrong Ji

Descriptive region features extracted by object detection networks have played an important role in the recent advancements of image captioning.

Descriptive Image Captioning +2

Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network

1 code implementation13 Dec 2020 Jiayi Ji, Yunpeng Luo, Xiaoshuai Sun, Fuhai Chen, Gen Luo, Yongjian Wu, Yue Gao, Rongrong Ji

The latter contains a Global Adaptive Controller that can adaptively fuse the global information into the decoder to guide the caption generation.

Caption Generation Image Captioning

Estimating heterogeneous survival treatment effect in observational data using machine learning

1 code implementation17 Aug 2020 Liangyuan Hu, Jiayi Ji, Fan Li

Methods for estimating heterogeneous treatment effect in observational data have largely focused on continuous or binary outcomes, and have been relatively less vetted with survival outcomes.

BIG-bench Machine Learning Causal Inference +1

Variational Structured Semantic Inference for Diverse Image Captioning

no code implementations NeurIPS 2019 Fuhai Chen, Rongrong Ji, Jiayi Ji, Xiaoshuai Sun, Baochang Zhang, Xuri Ge, Yongjian Wu, Feiyue Huang, Yan Wang

To model these two inherent diversities in image captioning, we propose a Variational Structured Semantic Inferring model (termed VSSI-cap) executed in a novel structured encoder-inferer-decoder schema.

Image Captioning

Semantic-aware Image Deblurring

no code implementations9 Oct 2019 Fuhai Chen, Rongrong Ji, Chengpeng Dai, Xiaoshuai Sun, Chia-Wen Lin, Jiayi Ji, Baochang Zhang, Feiyue Huang, Liujuan Cao

Specially, we propose a novel Structured-Spatial Semantic Embedding model for image deblurring (termed S3E-Deblur), which introduces a novel Structured-Spatial Semantic tree model (S3-tree) to bridge two basic tasks in computer vision: image deblurring (ImD) and image captioning (ImC).

Deblurring Image Captioning +1

Cannot find the paper you are looking for? You can Submit a new open access paper.