Search Results for author: Xiaodong Gu

Found 35 papers, 13 papers with code

Building Joint Relationship Attention Network for Image-Text Generation

no code implementations COLING 2022 Changzhi Wang, Xiaodong Gu

Specifically, unlike previous relationship-based approaches that explore only a single relationship in the image, our JRAN can effectively learn two relationships, the visual relationships among region features and the visual-semantic relationships between region features and semantic features, and further make a dynamic trade-off between them when outputting the relationship representation.

Text Generation

An Optimization Framework to Enforce Multi-View Consistency for Texturing 3D Meshes Using Pre-Trained Text-to-Image Models

no code implementations 22 Mar 2024 Zhengyi Zhao, Chen Song, Xiaodong Gu, Yuan Dong, Qi Zuo, Weihao Yuan, Zilong Dong, Liefeng Bo, QiXing Huang

In particular, the third and fourth stages are iterated, with the cuts obtained in the fourth stage encouraging non-rigid alignment in the third stage to focus on regions close to the cuts.

VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model

no code implementations 18 Mar 2024 Qi Zuo, Xiaodong Gu, Lingteng Qiu, Yuan Dong, Zhengyi Zhao, Weihao Yuan, Rui Peng, Siyu Zhu, Zilong Dong, Liefeng Bo, QiXing Huang

Images from video generative models are more suitable for multi-view generation because the underlying network architecture that generates them employs a temporal module to enforce frame consistency.

Denoising

Between Lines of Code: Unraveling the Distinct Patterns of Machine and Human Programmers

1 code implementation 12 Jan 2024 Yuling Shi, Hongyu Zhang, Chengcheng Wan, Xiaodong Gu

Based on our findings, we propose DetectCodeGPT, a novel method for detecting machine-generated code, which improves DetectGPT by capturing the distinct stylized patterns of code.

Code Generation

RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D

no code implementations 28 Nov 2023 Lingteng Qiu, GuanYing Chen, Xiaodong Gu, Qi Zuo, Mutian Xu, Yushuang Wu, Weihao Yuan, Zilong Dong, Liefeng Bo, Xiaoguang Han

Lifting 2D diffusion for 3D generation is a challenging problem due to the lack of geometric prior and the complex entanglement of materials and lighting in natural images.

Text to 3D

InfeRE: Step-by-Step Regex Generation via Chain of Inference

1 code implementation 8 Aug 2023 Shuai Zhang, Xiaodong Gu, Yuting Chen, Beijun Shen

In particular, InfeRE outperforms the popular tree-based generation approach by 18.1% and 11.3% on the two datasets, respectively, in terms of DFA@5 accuracy.

Text Matching

3D Former: Monocular Scene Reconstruction with 3D SDF Transformers

no code implementations 31 Jan 2023 Weihao Yuan, Xiaodong Gu, Heng Li, Zilong Dong, Siyu Zhu

In this work, we propose an SDF transformer network, which replaces the role of 3D CNN for better 3D feature aggregation.

Dense RGB SLAM with Neural Implicit Maps

no code implementations 21 Jan 2023 Heng Li, Xiaodong Gu, Weihao Yuan, Luwei Yang, Zilong Dong, Ping Tan

To reach this challenging goal without depth input, we introduce a hierarchical feature volume to facilitate the implicit map decoder.

Simultaneous Localization and Mapping

Influential Recommender System

no code implementations 18 Nov 2022 Haoren Zhu, Hao Ge, Xiaodong Gu, Pengfei Zhao, Dik Lun Lee

Traditional recommender systems are typically passive in that they try to adapt their recommendations to the user's historical interests.

Recommendation Systems

Continuous Decomposition of Granularity for Neural Paraphrase Generation

1 code implementation COLING 2022 Xiaodong Gu, Zhaowei Zhang, Sang-Woo Lee, Kang Min Yoo, Jung-Woo Ha

While Transformers have had significant success in paragraph generation, they treat sentences as linear sequences of tokens and often neglect their hierarchical information.

Paraphrase Generation, Sentence

Vision-Language Matching for Text-to-Image Synthesis via Generative Adversarial Networks

no code implementations 20 Aug 2022 Qingrong Cheng, Keyu Wen, Xiaodong Gu

To address this issue, we propose a novel Vision-Language Matching strategy for text-to-image synthesis, named VLMGAN*, which introduces a dual vision-language matching mechanism to strengthen the image quality and semantic consistency.

Image Generation

Contrastive Cross-Modal Knowledge Sharing Pre-training for Vision-Language Representation Learning and Retrieval

no code implementations 2 Jul 2022 Keyu Wen, Zhenshan Tan, Qingrong Cheng, Cheng Chen, Xiaodong Gu

Concretely, the first module is a weight-sharing transformer that builds on the head of the visual and textual encoders, aiming to semantically align text and image.

Contrastive Learning, Cross-Modal Retrieval, +5

RCP: Recurrent Closest Point for Scene Flow Estimation on 3D Point Clouds

no code implementations 23 May 2022 Xiaodong Gu, Chengzhou Tang, Weihao Yuan, Zuozhuo Dai, Siyu Zhu, Ping Tan

In the experiments, we evaluate the proposed method on both the 3D scene flow estimation and the point cloud registration task.

Motion Estimation, Point Cloud Registration, +1

UTC: A Unified Transformer with Inter-Task Contrastive Learning for Visual Dialog

no code implementations CVPR 2022 Cheng Chen, Yudong Zhu, Zhenshan Tan, Qingrong Cheng, Xin Jiang, Qun Liu, Xiaodong Gu

In this paper, we propose a contrastive learning-based framework UTC to unify and facilitate both discriminative and generative tasks in visual dialog with a single model.

Contrastive Learning, Representation Learning, +1

NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation

1 code implementation CVPR 2022 Weihao Yuan, Xiaodong Gu, Zuozhuo Dai, Siyu Zhu, Ping Tan

While recent works design increasingly complicated and powerful networks to directly regress the depth map, we take the path of CRFs optimization.

Depth Prediction, Monocular Depth Estimation

Neural Window Fully-Connected CRFs for Monocular Depth Estimation

no code implementations CVPR 2022 Weihao Yuan, Xiaodong Gu, Zuozhuo Dai, Siyu Zhu, Ping Tan

Estimating the accurate depth from a single image is challenging since it is inherently ambiguous and ill-posed.

Monocular Depth Estimation

RCP: Recurrent Closest Point for Point Cloud

1 code implementation CVPR 2022 Xiaodong Gu, Chengzhou Tang, Weihao Yuan, Zuozhuo Dai, Siyu Zhu, Ping Tan

In the experiments, we evaluate the proposed method on both the 3D scene flow estimation and the point cloud registration task.

Motion Estimation, Point Cloud Registration, +1

DRO: Deep Recurrent Optimizer for Video to Depth

1 code implementation 24 Mar 2021 Xiaodong Gu, Weihao Yuan, Zuozhuo Dai, Siyu Zhu, Chengzhou Tang, Zilong Dong, Ping Tan

There is increasing interest in studying the video-to-depth (V2D) problem with machine learning techniques.

Learning Dual Semantic Relations with Graph Attention for Image-Text Matching

1 code implementation 22 Oct 2020 Keyu Wen, Xiaodong Gu, Qingrong Cheng

Thus, a novel multi-level semantic relations enhancement approach named Dual Semantic Relations Attention Network (DSRAN) is proposed, which mainly consists of two modules: the separate semantic relations module and the joint semantic relations module.

Graph Attention, Image-text matching, +1

Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching

4 code implementations CVPR 2020 Xiaodong Gu, Zhiwen Fan, Zuozhuo Dai, Siyu Zhu, Feitong Tan, Ping Tan

The deep multi-view stereo (MVS) and stereo matching approaches generally construct 3D cost volumes to regularize and regress the output depth or disparity.

3D Reconstruction, Point Clouds, +1

Landmark Assisted CycleGAN for Cartoon Face Generation

no code implementations 2 Jul 2019 Ruizheng Wu, Xiaodong Gu, Xin Tao, Xiaoyong Shen, Yu-Wing Tai, Jiaya Jia

In this paper, we are interested in generating a cartoon face of a person by using unpaired training data between real faces and cartoon ones.

Face Generation

Attribute-Driven Spontaneous Motion in Unpaired Image Translation

1 code implementation ICCV 2019 Ruizheng Wu, Xin Tao, Xiaodong Gu, Xiaoyong Shen, Jiaya Jia

Current image translation methods, albeit effective at producing high-quality results in various applications, still give little consideration to geometric transformation.

Attribute, Motion Estimation, +1

Multiple Encoder-Decoders Net for Lane Detection

no code implementations ICLR 2019 Yuetong Du, Xiaodong Gu, Junqin Liu, Liwen He

For semantic image segmentation and lane detection, nets with a single spatial pyramid structure or encoder-decoder structure are usually exploited.

Image Segmentation, Lane Detection, +1

Batch DropBlock Network for Person Re-identification and Beyond

5 code implementations ICCV 2019 Zuozhuo Dai, Mingqiang Chen, Xiaodong Gu, Siyu Zhu, Ping Tan

In this paper, we propose the Batch DropBlock (BDB) Network, a two-branch network composed of a conventional ResNet-50 as the global branch and a feature dropping branch.

Image Retrieval, Metric Learning, +1

DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning

no code implementations 25 Apr 2017 Xiaodong Gu, Hongyu Zhang, Dongmei Zhang, Sunghun Kim

They rely on the sparse availability of bilingual projects, thus producing a limited number of API mappings.

Deep API Learning

no code implementations 27 May 2016 Xiaodong Gu, Hongyu Zhang, Dongmei Zhang, Sunghun Kim

We propose DeepAPI, a deep learning based approach to generate API usage sequences for a given natural language query.

Information Retrieval, Language Modelling, +2

Towards Dropout Training for Convolutional Neural Networks

no code implementations 1 Dec 2015 Haibing Wu, Xiaodong Gu

However, its effect in convolutional and pooling layers is still not clear.

Data Augmentation

Aspect-based Opinion Summarization with Convolutional Neural Networks

no code implementations 30 Nov 2015 Haibing Wu, Yiwei Gu, Shangdi Sun, Xiaodong Gu

To tackle aspect mapping and sentiment classification, we propose two Convolutional Neural Network (CNN) based methods, cascaded CNN and multitask CNN.

Aspect Extraction, Classification, +6
