no code implementations • COLING 2022 • Changzhi Wang, Xiaodong Gu
Specifically, unlike previous relationship-based approaches that explore only a single relationship in the image, our JRAN can effectively learn two relationships, the visual relationships among region features and the visual-semantic relationships between region features and semantic features, and further makes a dynamic trade-off between them when outputting the relationship representation.
no code implementations • 22 Mar 2024 • Zhengyi Zhao, Chen Song, Xiaodong Gu, Yuan Dong, Qi Zuo, Weihao Yuan, Zilong Dong, Liefeng Bo, QiXing Huang
In particular, the third and fourth stages are iterated, with the cuts obtained in the fourth stage encouraging non-rigid alignment in the third stage to focus on regions close to the cuts.
no code implementations • 18 Mar 2024 • Qi Zuo, Xiaodong Gu, Lingteng Qiu, Yuan Dong, Zhengyi Zhao, Weihao Yuan, Rui Peng, Siyu Zhu, Zilong Dong, Liefeng Bo, QiXing Huang
Images from video generative models are more suitable for multi-view generation because the underlying network architecture that generates them employs a temporal module to enforce frame consistency.
1 code implementation • 12 Jan 2024 • Yuling Shi, Hongyu Zhang, Chengcheng Wan, Xiaodong Gu
Based on our findings, we propose DetectCodeGPT, a novel method for detecting machine-generated code, which improves DetectGPT by capturing the distinct stylized patterns of code.
no code implementations • 28 Nov 2023 • Lingteng Qiu, GuanYing Chen, Xiaodong Gu, Qi Zuo, Mutian Xu, Yushuang Wu, Weihao Yuan, Zilong Dong, Liefeng Bo, Xiaoguang Han
Lifting 2D diffusion for 3D generation is a challenging problem due to the lack of geometric prior and the complex entanglement of materials and lighting in natural images.
1 code implementation • 8 Aug 2023 • Shuai Zhang, Xiaodong Gu, Yuting Chen, Beijun Shen
Particularly, InfeRE outperforms the popular tree-based generation approach by 18.1% and 11.3% on the two datasets, respectively, in terms of DFA@5 accuracy.
1 code implementation • 15 Mar 2023 • Youcai Zhang, Yuzhuo Qin, Hengwei Liu, Yanhao Zhang, Yaqian Li, Xiaodong Gu
Knowledge distillation (KD) has been extensively studied in single-label image classification.
no code implementations • 31 Jan 2023 • Weihao Yuan, Xiaodong Gu, Heng Li, Zilong Dong, Siyu Zhu
In this work, we propose an SDF transformer network, which replaces the role of 3D CNN for better 3D feature aggregation.
no code implementations • 21 Jan 2023 • Heng Li, Xiaodong Gu, Weihao Yuan, Luwei Yang, Zilong Dong, Ping Tan
To reach this challenging goal without depth input, we introduce a hierarchical feature volume to facilitate the implicit map decoder.
no code implementations • 18 Nov 2022 • Haoren Zhu, Hao Ge, Xiaodong Gu, Pengfei Zhao, Dik Lun Lee
Traditional recommender systems are typically passive in that they try to adapt their recommendations to the user's historical interests.
1 code implementation • COLING 2022 • Xiaodong Gu, Zhaowei Zhang, Sang-Woo Lee, Kang Min Yoo, Jung-Woo Ha
While Transformers have had significant success in paragraph generation, they treat sentences as linear sequences of tokens and often neglect their hierarchical information.
no code implementations • 20 Aug 2022 • Qingrong Cheng, Keyu Wen, Xiaodong Gu
To address this issue, we propose a novel Vision-Language Matching strategy for text-to-image synthesis, named VLMGAN*, which introduces a dual vision-language matching mechanism to strengthen the image quality and semantic consistency.
no code implementations • 13 Aug 2022 • Zhenshan Tan, Cheng Chen, Keyu Wen, Yuzhuo Qin, Xiaodong Gu
With the design of negative samples, the noise objects are suppressed.
no code implementations • 2 Jul 2022 • Keyu Wen, Zhenshan Tan, Qingrong Cheng, Cheng Chen, Xiaodong Gu
Concretely, the first module is a weight-sharing transformer that builds on the head of the visual and textual encoders, aiming to semantically align text and image.
no code implementations • 23 May 2022 • Xiaodong Gu, Chengzhou Tang, Weihao Yuan, Zuozhuo Dai, Siyu Zhu, Ping Tan
In the experiments, we evaluate the proposed method on both the 3D scene flow estimation and the point cloud registration task.
no code implementations • CVPR 2022 • Cheng Chen, Yudong Zhu, Zhenshan Tan, Qingrong Cheng, Xin Jiang, Qun Liu, Xiaodong Gu
In this paper, we propose a contrastive learning-based framework UTC to unify and facilitate both discriminative and generative tasks in visual dialog with a single model.
1 code implementation • CVPR 2022 • Weihao Yuan, Xiaodong Gu, Zuozhuo Dai, Siyu Zhu, Ping Tan
While recent works design increasingly complicated and powerful networks to directly regress the depth map, we take the path of CRFs optimization.
Ranked #1 on Depth Prediction on Matterport3D
no code implementations • CVPR 2022 • Weihao Yuan, Xiaodong Gu, Zuozhuo Dai, Siyu Zhu, Ping Tan
Estimating the accurate depth from a single image is challenging since it is inherently ambiguous and ill-posed.
1 code implementation • CVPR 2022 • Xiaodong Gu, Chengzhou Tang, Weihao Yuan, Zuozhuo Dai, Siyu Zhu, Ping Tan
In the experiments, we evaluate the proposed method on both the 3D scene flow estimation and the point cloud registration task.
no code implementations • 4 Nov 2021 • Xiaodong Gu, Kang Min Yoo, Sang-Woo Lee
Pre-trained language models (PLMs) have marked a huge leap in neural dialogue modeling.
1 code implementation • 24 Mar 2021 • Xiaodong Gu, Weihao Yuan, Zuozhuo Dai, Siyu Zhu, Chengzhou Tang, Zilong Dong, Ping Tan
There is increasing interest in studying the video-to-depth (V2D) problem with machine learning techniques.
1 code implementation • 3 Dec 2020 • Xiaodong Gu, Kang Min Yoo, Jung-Woo Ha
Recent advances in pre-trained language models have significantly improved neural response generation.
1 code implementation • 22 Oct 2020 • Keyu Wen, Xiaodong Gu, Qingrong Cheng
Thus, a novel multi-level semantic relations enhancement approach named Dual Semantic Relations Attention Network (DSRAN) is proposed, which mainly consists of two modules: the separate semantic relations module and the joint semantic relations module.
4 code implementations • CVPR 2020 • Xiaodong Gu, Zhiwen Fan, Zuozhuo Dai, Siyu Zhu, Feitong Tan, Ping Tan
The deep multi-view stereo (MVS) and stereo matching approaches generally construct 3D cost volumes to regularize and regress the output depth or disparity.
Ranked #12 on Point Clouds on Tanks and Temples
no code implementations • 2 Jul 2019 • Ruizheng Wu, Xiaodong Gu, Xin Tao, Xiaoyong Shen, Yu-Wing Tai, Jiaya Jia
In this paper, we are interested in generating a cartoon face of a person by using unpaired training data between real faces and cartoon ones.
1 code implementation • ICCV 2019 • Ruizheng Wu, Xin Tao, Xiaodong Gu, Xiaoyong Shen, Jiaya Jia
Current image translation methods, albeit effective at producing high-quality results in various applications, still give little consideration to geometric transforms.
no code implementations • ICLR 2019 • Yuetong Du, Xiaodong Gu, Junqin Liu, Liwen He
For semantic image segmentation and lane detection, nets with a single spatial pyramid structure or encoder-decoder structure are usually exploited.
5 code implementations • ICCV 2019 • Zuozhuo Dai, Mingqiang Chen, Xiaodong Gu, Siyu Zhu, Ping Tan
In this paper, we propose the Batch DropBlock (BDB) Network, which is a two-branch network composed of a conventional ResNet-50 as the global branch and a feature dropping branch.
Ranked #8 on Person Re-Identification on Market-1501-C
3 code implementations • ICLR 2019 • Xiaodong Gu, Kyunghyun Cho, Jung-Woo Ha, Sunghun Kim
Variational autoencoders (VAEs) have shown promise in data-driven conversation modeling.
no code implementations • 25 Apr 2017 • Xiaodong Gu, Hongyu Zhang, Dongmei Zhang, Sunghun Kim
They rely on the sparse availability of bilingual projects, thus producing a limited number of API mappings.
no code implementations • 27 May 2016 • Xiaodong Gu, Hongyu Zhang, Dongmei Zhang, Sunghun Kim
We propose DeepAPI, a deep learning based approach to generate API usage sequences for a given natural language query.
no code implementations • 4 Dec 2015 • Haibing Wu, Xiaodong Gu
Recently, dropout has seen increasing use in deep learning.
no code implementations • 1 Dec 2015 • Haibing Wu, Xiaodong Gu
However, its effect in convolutional and pooling layers is still not clear.
no code implementations • 30 Nov 2015 • Haibing Wu, Yiwei Gu, Shangdi Sun, Xiaodong Gu
To tackle aspect mapping and sentiment classification, we propose two Convolutional Neural Network (CNN) based methods, cascaded CNN and multitask CNN.