no code implementations • 18 Mar 2024 • Jiaxiang Tang, Ruijie Lu, Xiaokang Chen, Xiang Wen, Gang Zeng, Ziwei Liu
Text-to-texture synthesis has become a new frontier in 3D content creation thanks to the recent advances in text-to-image models.
1 code implementation • 7 Feb 2024 • Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen, Tengfei Wang, Gang Zeng, Ziwei Liu
2) 3D Backbone: We present an asymmetric U-Net as a high-throughput backbone operating on multi-view images, which can be produced from text or single-view image input by leveraging multi-view diffusion models.
1 code implementation • 25 May 2023 • Yan Liu, Yan Gao, Zhe Su, Xiaokang Chen, Elliott Ash, Jian-Guang Lou
In this work, we aim to uncover and categorize social biases in Text-to-SQL models.
no code implementations • 25 May 2023 • Xiaokang Chen, Jiaxiang Tang, Diwen Wan, Jingbo Wang, Gang Zeng
We propose to imitate the backbone feature of off-the-shelf perception models to achieve zero-shot semantic segmentation with NeRF.
2 code implementations • NeurIPS 2023 • Wenhai Wang, Zhe Chen, Xiaokang Chen, Jiannan Wu, Xizhou Zhu, Gang Zeng, Ping Luo, Tong Lu, Jie Zhou, Yu Qiao, Jifeng Dai
We hope this model can set a new baseline for generalist vision and language models.
no code implementations • 20 Mar 2023 • Xiaokang Chen, Yajie Xing, Gang Zeng
In this paper, we propose a real-time semantic scene completion method with a feature aggregation strategy and conditioned prediction module.
1 code implementation • ICCV 2023 • Jiaxiang Tang, Hang Zhou, Xiaokang Chen, Tianshu Hu, Errui Ding, Jingdong Wang, Gang Zeng
Neural Radiance Fields (NeRF) have constituted a remarkable breakthrough in image-based 3D reconstruction.
no code implementations • 21 Feb 2023 • Yan Liu, Xiaokang Chen, Qi Dai
However, current works pursuing sentence-level explanations rely heavily on annotated training data, which limits the development of interpretability to only a few tasks.
1 code implementation • 27 Jan 2023 • Jie Zhu, Jiyang Qi, Mingyu Ding, Xiaokang Chen, Ping Luo, Xinggang Wang, Wenyu Liu, Leye Wang, Jingdong Wang
The study is mainly motivated by the observation that random views, used in contrastive learning, and random masked (visible) patches, used in masked image modeling, are often about object parts.
1 code implementation • 22 Nov 2022 • Jiaxiang Tang, Kaisiyuan Wang, Hang Zhou, Xiaokang Chen, Dongliang He, Tianshu Hu, Jingtuo Liu, Gang Zeng, Jingdong Wang
While dynamic Neural Radiance Fields (NeRF) have shown success in high-fidelity 3D modeling of talking portraits, the slow training and inference speed severely obstruct their potential usage.
no code implementations • 17 Nov 2022 • Xinyu Zhang, Jiahui Chen, Junkun Yuan, Qiang Chen, Jian Wang, Xiaodi Wang, Shumin Han, Xiaokang Chen, Jimin Pi, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang
That is to say, the smaller the model, the lower the mask ratio needs to be.
no code implementations • 17 Nov 2022 • Xiaokang Chen, Jiahui Chen, Yan Liu, Gang Zeng
Specifically, Adaptive Matching applies bipartite matching to adaptively match the outputs of the teacher and the student in each decoder layer. Fixed Matching instead fixes the correspondence between the outputs of the teacher and the student through the same object queries, with the teacher's fixed object queries fed to the decoder of the student as an auxiliary group.
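The two matching schemes above can be contrasted with a minimal sketch. This is not the paper's implementation: it uses brute-force permutation search over toy 2-D prediction vectors in place of the Hungarian algorithm on real DETR outputs, and the `teacher`/`student` arrays are made-up examples.

```python
from itertools import permutations

def l2(a, b):
    """Squared L2 distance between two prediction vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def adaptive_matching(teacher, student):
    """Adaptive matching: bipartite-match student outputs to teacher outputs
    by picking the permutation with the lowest total cost (brute force)."""
    n = len(teacher)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        cost = sum(l2(teacher[i], student[perm[i]]) for i in range(n))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost

def fixed_matching(teacher, student):
    """Fixed matching: teacher output i is always paired with student
    output i, as if both decoders shared the same object queries."""
    cost = sum(l2(t, s) for t, s in zip(teacher, student))
    return tuple(range(len(teacher))), cost

teacher = [(0.0, 0.0), (1.0, 1.0)]
student = [(1.0, 1.0), (0.1, 0.0)]
print(adaptive_matching(teacher, student))  # pairs teacher 0 with student 1
print(fixed_matching(teacher, student))     # keeps the index-aligned pairing
```

Adaptive matching finds the low-cost pairing even when the student emits its predictions in a different query order, which is exactly the situation fixed matching cannot handle on its own.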
no code implementations • arXiv 2022 • Qiang Chen, Jian Wang, Chuchu Han, Shan Zhang, Zexian Li, Xiaokang Chen, Jiahui Chen, Xiaodi Wang, Shumin Han, Gang Zhang, Haocheng Feng, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang
The training process consists of self-supervised pretraining and finetuning a ViT-Huge encoder on ImageNet-1K, pretraining the detector on Object365, and finally finetuning it on COCO.
Ranked #8 on Object Detection on COCO test-dev
2 code implementations • ICCV 2023 • Qiang Chen, Xiaokang Chen, Jian Wang, Shan Zhang, Kun Yao, Haocheng Feng, Junyu Han, Errui Ding, Gang Zeng, Jingdong Wang
Detection transformer (DETR) relies on one-to-one assignment, assigning one ground-truth object to one prediction, for end-to-end detection without NMS post-processing.
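The one-to-one assignment that DETR relies on can be sketched with a toy cost matrix. This is only an illustration, not DETR's matcher: real implementations use the Hungarian algorithm over class and box costs, while here a brute-force search assigns each ground-truth object to exactly one prediction and leaves the rest as "no object"; the cost values are invented.

```python
from itertools import permutations

def one_to_one_assign(cost):
    """cost[i][j] is the cost of assigning ground truth i to prediction j.
    Returns one prediction index per ground truth, minimizing total cost;
    unmatched predictions are treated as 'no object' (brute force)."""
    n_gt, n_pred = len(cost), len(cost[0])
    best, best_cost = None, float("inf")
    for perm in permutations(range(n_pred), n_gt):
        c = sum(cost[i][perm[i]] for i in range(n_gt))
        if c < best_cost:
            best, best_cost = perm, c
    return best

# 2 ground truths, 4 predictions: each ground truth gets exactly one match.
cost = [[0.9, 0.1, 0.8, 0.7],
        [0.2, 0.9, 0.95, 0.3]]
match = one_to_one_assign(cost)
print(match)  # (1, 0): GT0 -> pred1, GT1 -> pred0; preds 2 and 3 -> no object
```

Because each prediction can match at most one ground truth, duplicates are suppressed during training itself, which is why no NMS post-processing is needed at inference.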
no code implementations • 18 Jul 2022 • Xiaokang Chen, Fangyun Wei, Gang Zeng, Jingdong Wang
Inspired by Conditional DETR, an improved DETR with fast training convergence that introduced box queries (originally called spatial queries) for internal decoder layers, we reformulate the object query as a box query: a composition of the embedding of the reference point and the transformation of the box with respect to the reference point.
2 code implementations • 30 May 2022 • Jiaxiang Tang, Xiaokang Chen, Jingbo Wang, Gang Zeng
To circumvent this hurdle, in this paper, we present an explicit neural field representation that enables efficient and convenient manipulation of models.
1 code implementation • 31 Mar 2022 • Jiaxiang Tang, Xiaokang Chen, Jingbo Wang, Gang Zeng
Semantic scene reconstruction from point cloud is an essential and challenging task for 3D scene understanding.
no code implementations • 28 Mar 2022 • Min Zhong, Xinghao Chen, Xiaokang Chen, Gang Zeng, Yunhe Wang
For instance, our approach achieves a 66.4% mAP at the 0.5 IoU threshold on the ScanNetV2 test set, which is 1.9% higher than the state-of-the-art method.
Ranked #6 on 3D Instance Segmentation on S3DIS
6 code implementations • 7 Feb 2022 • Xiaokang Chen, Mingyu Ding, Xiaodi Wang, Ying Xin, Shentong Mo, Yunhao Wang, Shumin Han, Ping Luo, Gang Zeng, Jingdong Wang
The pretraining tasks include two tasks: masked representation prediction - predict the representations for the masked patches, and masked patch reconstruction - reconstruct the masked patches.
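The setup behind both pretraining tasks — splitting an image into patches and masking a random subset — can be sketched minimally. This is not the paper's pipeline: a nested-list "image" stands in for real tensors, and the 4x4 toy image, patch size, and mask ratio are arbitrary choices for illustration.

```python
import random

def patchify(image, patch):
    """Split an H x W image (nested lists) into non-overlapping
    patch x patch tiles, flattening each tile to a list of pixels."""
    h, w = len(image), len(image[0])
    tiles = []
    for r in range(0, h, patch):
        for c in range(0, w, patch):
            tiles.append([image[r + i][c + j]
                          for i in range(patch) for j in range(patch)])
    return tiles

def mask_patches(n_patches, ratio, seed=0):
    """Randomly pick masked patch indices; the rest stay visible.
    Masked patches are the targets for representation prediction
    and pixel reconstruction."""
    rng = random.Random(seed)
    masked = set(rng.sample(range(n_patches), int(n_patches * ratio)))
    visible = [i for i in range(n_patches) if i not in masked]
    return sorted(masked), visible

image = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 toy image
tiles = patchify(image, 2)                   # four 2x2 patches
masked, visible = mask_patches(len(tiles), 0.5)
```

The encoder would see only the `visible` patches; the two pretraining heads then predict the representations and pixel contents of the `masked` ones.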
no code implementations • 24 Dec 2021 • Xiaokang Chen, Jiaxiang Tang, Jingbo Wang, Gang Zeng
First, we convert the voxelized scenes to point clouds by removing the visible empty voxels and adopt a deep point stream to capture semantic information from the scene efficiently.
Ranked #4 on 3D Semantic Scene Completion on NYUv2
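The voxel-to-point conversion described above can be sketched in a few lines. This is a simplification of the paper's step: a dense occupancy grid (nested lists) is turned into a sparse list of occupied coordinates, dropping the empty voxels; the 2x2x2 example grid is invented.

```python
def voxels_to_points(grid):
    """Convert a dense 3D occupancy grid into a sparse point cloud:
    keep only the coordinates of occupied voxels, drop the empty ones."""
    points = []
    for x, plane in enumerate(grid):
        for y, row in enumerate(plane):
            for z, occ in enumerate(row):
                if occ:
                    points.append((x, y, z))
    return points

# 2x2x2 grid with two occupied voxels
grid = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
print(voxels_to_points(grid))  # [(0, 0, 0), (1, 1, 1)]
```

Since indoor scenes are mostly empty space, the point stream operates on far fewer elements than the dense voxel grid, which is where the efficiency gain comes from.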
3 code implementations • ICCV 2021 • Depu Meng, Xiaokang Chen, Zejia Fan, Gang Zeng, Houqiang Li, Yuhui Yuan, Lei Sun, Jingdong Wang
Our approach, named conditional DETR, learns a conditional spatial query from the decoder embedding for decoder multi-head cross-attention.
1 code implementation • 19 Jul 2021 • Jiaxiang Tang, Xiaokang Chen, Gang Zeng
Inspired by the recent progress in implicit neural representation, we propose to formulate guided super-resolution as a neural implicit image interpolation problem: we take the form of a general image interpolation but use a novel Joint Implicit Image Function (JIIF) representation to learn both the interpolation weights and values.
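The "general image interpolation" form that JIIF builds on can be sketched as follows. This is not JIIF itself: where JIIF predicts the interpolation weights (and values) with a network conditioned on the guidance image, this sketch plugs in fixed bilinear weights as a stand-in, and the 2x2 low-resolution image is a made-up example.

```python
def interpolate(lowres, x, y):
    """General image interpolation: the value at a continuous query (x, y)
    is a weighted sum of the four surrounding low-res pixels. JIIF would
    learn these weights; here fixed bilinear weights are used instead."""
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(lowres) - 1)
    y1 = min(y0 + 1, len(lowres[0]) - 1)
    dx, dy = x - x0, y - y0
    weights = [(1 - dx) * (1 - dy), (1 - dx) * dy,
               dx * (1 - dy), dx * dy]
    values = [lowres[x0][y0], lowres[x0][y1],
              lowres[x1][y0], lowres[x1][y1]]
    return sum(w * v for w, v in zip(weights, values))

lowres = [[0.0, 1.0], [2.0, 3.0]]
print(interpolate(lowres, 0.5, 0.5))  # 1.5, the average of the four pixels
```

Querying arbitrary continuous coordinates this way is what lets an implicit formulation produce output at any target resolution from the same low-resolution input.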
3 code implementations • CVPR 2021 • Xiaokang Chen, Yuhui Yuan, Gang Zeng, Jingdong Wang
Our approach imposes consistency on two segmentation networks perturbed with different initializations for the same input image.
Ranked #2 on Semi-Supervised Semantic Segmentation on WoodScape
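The cross-consistency idea can be sketched for a single pixel. This is a simplified reading, not the paper's training code: each of the two differently initialized networks is supervised with the other's argmax prediction as a pseudo label, and the class-probability vectors below are invented toy values.

```python
import math

def cross_entropy(probs, label):
    """Cross-entropy of a predicted distribution against a hard label."""
    return -math.log(probs[label])

def cps_loss(probs_a, probs_b):
    """Cross pseudo supervision: each network is trained against the
    other network's argmax prediction, used as a one-hot pseudo label."""
    label_a = max(range(len(probs_a)), key=probs_a.__getitem__)
    label_b = max(range(len(probs_b)), key=probs_b.__getitem__)
    return cross_entropy(probs_a, label_b) + cross_entropy(probs_b, label_a)

# two differently initialized networks' class probabilities for one pixel
probs_a = [0.7, 0.2, 0.1]
probs_b = [0.6, 0.3, 0.1]
print(cps_loss(probs_a, probs_b))  # both networks agree on class 0 here
```

On unlabeled images this loss pushes the two networks toward agreeing confident predictions, which is how consistency regularization extracts a training signal without ground-truth masks.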
2 code implementations • ECCV 2020 • Xiaokang Chen, Kwan-Yee Lin, Jingbo Wang, Wayne Wu, Chen Qian, Hongsheng Li, Gang Zeng
Depth information has proven to be a useful cue in the semantic segmentation of RGB-D images for providing a geometric counterpart to the RGB representation.
2 code implementations • CVPR 2020 • Xiaokang Chen, Kwan-Yee Lin, Chen Qian, Gang Zeng, Hongsheng Li
To this end, we first propose a novel 3D sketch-aware feature embedding to explicitly encode geometric information effectively and efficiently.
3D Semantic Scene Completion from a single RGB image • Hallucination
11 code implementations • ECCV 2020 • Yuhui Yuan, Xiaokang Chen, Xilin Chen, Jingdong Wang
We empirically demonstrate that the proposed approach achieves competitive performance on various challenging semantic segmentation benchmarks: Cityscapes, ADE20K, LIP, PASCAL-Context, and COCO-Stuff.
Ranked #5 on Semantic Segmentation on LIP val