Search Results for author: Di Kang

Found 30 papers, 11 papers with code

Advances in 3D Generation: A Survey

no code implementations31 Jan 2024 Xiaoyu Li, Qi Zhang, Di Kang, Weihao Cheng, Yiming Gao, Jingbo Zhang, Zhihao Liang, Jing Liao, Yan-Pei Cao, Ying Shan

In this survey, we aim to introduce the fundamental methodologies of 3D generation methods and establish a structured roadmap, encompassing 3D representation, generation methods, datasets, and corresponding applications.

3D Generation · Novel View Synthesis

TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts

no code implementations26 Jan 2024 Jingyu Zhuang, Di Kang, Yan-Pei Cao, Guanbin Li, Liang Lin, Ying Shan

To this end, we propose a 3D scene editing framework, TIP-Editor, that accepts both text and image prompts and a 3D bounding box to specify the editing region.

3D scene Editing

PHRIT: Parametric Hand Representation with Implicit Template

no code implementations ICCV 2023 Zhisheng Huang, Yujin Chen, Di Kang, Jinlu Zhang, Zhigang Tu

We propose PHRIT, a novel approach for parametric hand mesh modeling with an implicit template that combines the advantages of both parametric meshes and implicit representations.

3D Reconstruction · Single-View 3D Reconstruction

Neural Point-based Volumetric Avatar: Surface-guided Neural Points for Efficient and Photorealistic Volumetric Head Avatar

no code implementations11 Jul 2023 Cong Wang, Di Kang, Yan-Pei Cao, Linchao Bao, Ying Shan, Song-Hai Zhang

Rendering photorealistic and dynamically moving human heads is crucial for ensuring a pleasant and immersive experience in AR/VR and video conferencing applications.

High-level Feature Guided Decoding for Semantic Segmentation

no code implementations15 Mar 2023 Ye Huang, Di Kang, Shenghua Gao, Wen Li, Lixin Duan

One crucial design of the HFG is to protect the high-level features from being contaminated by using proper stop-gradient operations so that the backbone does not update according to the noisy gradient from the upsampler.

Semantic Segmentation
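
The stop-gradient trick described in the abstract above can be sketched in a few lines of PyTorch; the module below is an illustrative assumption (names, channel counts and the decoder structure are not from the paper), showing only how detaching the high-level features keeps the upsampler's gradient away from the backbone.

```python
import torch
import torch.nn as nn

class HFGDecoderSketch(nn.Module):
    """Minimal sketch: high-level features guide the upsampler, but gradients
    from the upsampler are blocked so they cannot 'contaminate' the backbone."""

    def __init__(self, channels: int = 256, num_classes: int = 19):
        super().__init__()
        self.classifier = nn.Conv2d(channels, num_classes, kernel_size=1)
        self.upsampler = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(channels, num_classes, kernel_size=1),
        )

    def forward(self, high_level_feat: torch.Tensor) -> torch.Tensor:
        # The coarse prediction keeps its gradient path to the backbone.
        coarse = self.classifier(high_level_feat)
        # The upsampling branch sees a detached copy: its noisier gradient
        # never reaches the backbone features.
        fine = self.upsampler(high_level_feat.detach())
        return fine + nn.functional.interpolate(
            coarse, scale_factor=4, mode="bilinear", align_corners=False)

feat = torch.randn(1, 256, 32, 32)
print(HFGDecoderSketch()(feat).shape)  # torch.Size([1, 19, 128, 128])
```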

Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry

1 code implementation CVPR 2023 Jiaxu Zhang, Junwu Weng, Di Kang, Fang Zhao, Shaoli Huang, Xuefei Zhe, Linchao Bao, Ying Shan, Jue Wang, Zhigang Tu

Driven by our explored distance-based losses that explicitly model the motion semantics and geometry, these two modules can learn residual motion modifications on the source motion to generate plausible retargeted motion in a single inference without post-processing.

motion retargeting
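
A minimal sketch of the residual idea mentioned above, assuming a PyTorch setup in which joint rotations are copied from the source motion and a small network predicts a correction; the class name, feature sizes and conditioning are hypothetical, and the paper's actual modules and distance-based losses are not reproduced here.

```python
import torch
import torch.nn as nn

class ResidualRetargetSketch(nn.Module):
    """Sketch: predict a residual correction to the source joint rotations,
    conditioned on the source motion and a target-skeleton code, so the
    retargeted motion stays close to the source unless a correction is needed."""

    def __init__(self, num_joints: int = 22, skel_dim: int = 64):
        super().__init__()
        in_dim = num_joints * 6 + skel_dim   # 6D rotation per joint + skeleton code
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, num_joints * 6),
        )

    def forward(self, src_rot6d, tgt_skeleton_code):
        # src_rot6d: (batch, frames, joints*6); tgt_skeleton_code: (batch, skel_dim)
        b, t, _ = src_rot6d.shape
        cond = tgt_skeleton_code.unsqueeze(1).expand(b, t, -1)
        residual = self.net(torch.cat([src_rot6d, cond], dim=-1))
        return src_rot6d + residual   # retargeted motion = source + learned correction

model = ResidualRetargetSketch()
out = model(torch.randn(2, 30, 22 * 6), torch.randn(2, 64))
print(out.shape)  # torch.Size([2, 30, 132])
```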

Get3DHuman: Lifting StyleGAN-Human into a 3D Generative Model using Pixel-aligned Reconstruction Priors

no code implementations ICCV 2023 Zhangyang Xiong, Di Kang, Derong Jin, Weikai Chen, Linchao Bao, Shuguang Cui, Xiaoguang Han

Specifically, we bridge the latent space of Get3DHuman with that of StyleGAN-Human via a specially-designed prior network, where the input latent code is mapped to the shape and texture feature volumes spanned by the pixel-aligned 3D reconstructor.
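
A rough sketch of the latent-bridging idea, assuming a PyTorch-style prior network that maps a 512-D StyleGAN-like latent code to coarse shape and texture feature volumes; all shapes and layer choices here are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class PriorNetSketch(nn.Module):
    """Sketch: map a StyleGAN-style latent code to coarse 'shape' and 'texture'
    feature volumes, mimicking the idea of bridging a 2D generator's latent
    space with pixel-aligned 3D reconstruction features."""

    def __init__(self, latent_dim: int = 512, feat_ch: int = 32, res: int = 16):
        super().__init__()
        self.res = res
        self.feat_ch = feat_ch
        self.shape_head = nn.Linear(latent_dim, feat_ch * res ** 3)
        self.tex_head = nn.Linear(latent_dim, feat_ch * res ** 3)

    def forward(self, w: torch.Tensor):
        b = w.shape[0]
        shape_vol = self.shape_head(w).view(b, self.feat_ch, self.res, self.res, self.res)
        tex_vol = self.tex_head(w).view(b, self.feat_ch, self.res, self.res, self.res)
        return shape_vol, tex_vol

shape_vol, tex_vol = PriorNetSketch()(torch.randn(4, 512))
print(shape_vol.shape, tex_vol.shape)  # (4, 32, 16, 16, 16) each
```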

Audio2Gestures: Generating Diverse Gestures from Audio

no code implementations17 Jan 2023 Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Linchao Bao, Zhenyu He

Finally, we demonstrate that our method can be readily used to generate motion sequences with user-specified motion clips on the timeline.

Gesture Generation

Learning Audio-Driven Viseme Dynamics for 3D Face Animation

no code implementations15 Jan 2023 Linchao Bao, Haoxian Zhang, Yue Qian, Tangli Xue, Changhai Chen, Xuefei Zhe, Di Kang

We show that the predicted viseme curves can be applied to different viseme-rigged characters to yield various personalized animations with realistic and natural facial motions.

3D Face Animation

CARD: Semantic Segmentation with Efficient Class-Aware Regularized Decoder

1 code implementation11 Jan 2023 Ye Huang, Di Kang, Liang Chen, Wenjing Jia, Xiangjian He, Lixin Duan, Xuefei Zhe, Linchao Bao

Extensive experiments and ablation studies conducted on multiple benchmark datasets demonstrate that the proposed CAR can boost the accuracy of all baseline models by up to 2.23% mIoU with superior generalization ability.

Representation Learning · Semantic Segmentation +1

FFHQ-UV: Normalized Facial UV-Texture Dataset for 3D Face Reconstruction

1 code implementation CVPR 2023 Haoran Bai, Di Kang, Haoxian Zhang, Jinshan Pan, Linchao Bao

Our pipeline utilizes the recent advances in StyleGAN-based facial image editing approaches to generate multi-view normalized face images from single-image inputs.

3D Face Reconstruction

NEURAL MARIONETTE: A Transformer-based Multi-action Human Motion Synthesis System

no code implementations27 Sep 2022 Weiqiang Wang, Xuefei Zhe, Qiuhong Ke, Di Kang, Tingguang Li, Ruizhi Chen, Linchao Bao

Along with the novel system, we also present a new dataset dedicated to the multi-action motion synthesis task, which contains both action tags and their contextual information.

Motion Synthesis +1

Learning to Construct 3D Building Wireframes from 3D Line Clouds

1 code implementation25 Aug 2022 Yicheng Luo, Jing Ren, Xuefei Zhe, Di Kang, Yajing Xu, Peter Wonka, Linchao Bao

The network takes a line cloud as input, i.e., an unstructured and unordered set of 3D line segments extracted from multi-view images, and outputs a 3D wireframe of the underlying building, which consists of a sparse set of 3D junctions connected by line segments.
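
To make the input and output formats concrete, here is a small NumPy sketch of the data structures the abstract describes; the arrays and helper below are illustrative placeholders, not the released code's format.

```python
import numpy as np

# A "line cloud": an unordered set of N 3D segments, stored as an (N, 6) array
# of endpoint coordinates (x1, y1, z1, x2, y2, z2).  Values here are random
# placeholders standing in for segments extracted from multi-view images.
line_cloud = np.random.rand(500, 6).astype(np.float32)

# The target wireframe is far more compact: a sparse set of 3D junctions plus
# an edge list saying which junction pairs are connected.
junctions = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [1.0, 1.0, 0.0],
                      [1.0, 1.0, 2.5]], dtype=np.float32)    # (J, 3)
edges = np.array([[0, 1], [1, 2], [2, 3]], dtype=np.int64)   # (E, 2) junction indices

def edge_lengths(junctions: np.ndarray, edges: np.ndarray) -> np.ndarray:
    """Length of every wireframe edge, e.g. for simple sanity checks."""
    return np.linalg.norm(junctions[edges[:, 0]] - junctions[edges[:, 1]], axis=1)

print(line_cloud.shape, junctions.shape, edges.shape, edge_lengths(junctions, edges))
```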

Semi-signed prioritized neural fitting for surface reconstruction from unoriented point clouds

no code implementations14 Jun 2022 Runsong Zhu, Di Kang, Ka-Hei Hui, Yue Qian, Xuefei Zhe, Zhen Dong, Linchao Bao, Pheng-Ann Heng, Chi-Wing Fu

To guide the network to quickly fit the coarse shape, we propose to utilize signed supervision in regions that are obviously outside the object and can be easily determined, resulting in our semi-signed supervision.

Surface Reconstruction
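
A toy sketch of the masking idea behind semi-signed supervision, assuming a PyTorch SDF network and a simple axis-aligned box as the "obviously outside" region; the paper's actual region determination, prioritization and loss terms are more involved.

```python
import torch

def semi_signed_loss_sketch(pred_sdf, points, bbox_min, bbox_max, margin=0.0):
    """Toy version of 'semi-signed' supervision: points that are clearly outside
    the object (here, simply outside an axis-aligned box that bounds it) must
    have positive SDF; everywhere else no sign assumption is made."""
    inside_box = ((points > bbox_min) & (points < bbox_max)).all(dim=-1)
    outside = ~inside_box                          # the "obviously outside" region
    if outside.any():
        # Signed supervision: penalize non-positive predictions outside the box.
        return torch.relu(margin - pred_sdf[outside]).mean()
    return pred_sdf.sum() * 0                      # keep a valid graph if empty

pts = torch.rand(1024, 3) * 2 - 1                  # query points in [-1, 1]^3
pred = torch.randn(1024, requires_grad=True)       # stand-in SDF predictions
loss = semi_signed_loss_sketch(pred, pts, torch.tensor([-0.5] * 3), torch.tensor([0.5] * 3))
loss.backward()
print(float(loss))
```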

REALY: Rethinking the Evaluation of 3D Face Reconstruction

1 code implementation18 Mar 2022 Zenghao Chai, Haoxian Zhang, Jing Ren, Di Kang, Zhengzhuo Xu, Xuefei Zhe, Chun Yuan, Linchao Bao

The evaluation of 3D face reconstruction results typically relies on a rigid shape alignment between the estimated 3D model and the ground-truth scan.

3D Face Reconstruction
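
The rigid alignment step mentioned above is usually a least-squares (Kabsch/Procrustes) fit; the NumPy sketch below shows that generic step for corresponding point sets and is not REALY's actual region-aware evaluation protocol.

```python
import numpy as np

def rigid_align(src: np.ndarray, tgt: np.ndarray):
    """Least-squares rigid alignment (Kabsch): find R, t minimizing
    ||R @ src_i + t - tgt_i||.  Assumes (N, 3) point sets with known
    correspondences."""
    mu_s, mu_t = src.mean(0), tgt.mean(0)
    S, T = src - mu_s, tgt - mu_t
    U, _, Vt = np.linalg.svd(T.T @ S)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflections
    R = U @ D @ Vt
    t = mu_t - R @ mu_s
    return R, t

src = np.random.rand(100, 3)
theta = np.pi / 6
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
tgt = src @ R_true.T + np.array([0.1, -0.2, 0.3])
R, t = rigid_align(src, tgt)
print(np.allclose(R, R_true), np.allclose(t, [0.1, -0.2, 0.3]))  # True True
```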

CAR: Class-aware Regularizations for Semantic Segmentation

1 code implementation arXiv:2203.07160 2022 Ye Huang, Di Kang, Liang Chen, Xuefei Zhe, Wenjing Jia, Xiangjian He, Linchao Bao

Recent segmentation methods, such as OCR and CPNet, utilizing "class level" information in addition to pixel features, have achieved notable success for boosting the accuracy of existing network modules.

Representation Learning · Semantic Segmentation

NeRFReN: Neural Radiance Fields with Reflections

no code implementations CVPR 2022 Yuan-Chen Guo, Di Kang, Linchao Bao, Yu He, Song-Hai Zhang

Specifically, we propose to split a scene into transmitted and reflected components, and model the two components with separate neural radiance fields.

Depth Estimation · Novel View Synthesis
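
A heavily simplified sketch of the two-component decomposition, assuming small PyTorch MLPs and a per-point blending weight; real NeRF rendering (ray sampling, volume integration) and NeRFReN's specific formulation and regularizers are omitted.

```python
import torch
import torch.nn as nn

class TwoBranchRadianceSketch(nn.Module):
    """Sketch of the decomposition idea: one field for the transmitted scene,
    one for the reflected scene, blended per point."""

    def __init__(self, hidden: int = 128):
        super().__init__()
        def mlp(out_dim):
            return nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                 nn.Linear(hidden, out_dim))
        self.transmitted = mlp(3)   # RGB of the transmitted component
        self.reflected = mlp(3)     # RGB of the reflected component
        self.beta = mlp(1)          # per-point reflection weight

    def forward(self, x):           # x: (..., 3) sample positions
        t = torch.sigmoid(self.transmitted(x))
        r = torch.sigmoid(self.reflected(x))
        b = torch.sigmoid(self.beta(x))
        return t + b * r            # composite color

pts = torch.rand(4096, 3)
print(TwoBranchRadianceSketch()(pts).shape)  # torch.Size([4096, 3])
```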

Audio2Gestures: Generating Diverse Gestures from Speech Audio with Conditional Variational Autoencoders

no code implementations ICCV 2021 Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Zhenyu He, Linchao Bao

In order to overcome this problem, we propose a novel conditional variational autoencoder (VAE) that explicitly models one-to-many audio-to-motion mapping by splitting the cross-modal latent code into shared code and motion-specific code.

Gesture Generation
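
A minimal sketch of the latent-code split, assuming a toy PyTorch VAE on flattened motion features; the audio encoder, sequence modelling and training losses from the paper are omitted, and all dimensions are illustrative.

```python
import torch
import torch.nn as nn

class SplitLatentVAESketch(nn.Module):
    """Sketch of the core idea: encode motion into a 'shared' code (expected to
    align with audio) and a 'motion-specific' code (captures the one-to-many
    variation)."""

    def __init__(self, motion_dim=64, shared_dim=16, specific_dim=16):
        super().__init__()
        self.enc = nn.Linear(motion_dim, 2 * (shared_dim + specific_dim))  # mu & logvar
        self.dec = nn.Linear(shared_dim + specific_dim, motion_dim)
        self.split = (shared_dim, specific_dim)

    def forward(self, motion):
        mu, logvar = self.enc(motion).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
        z_shared, z_specific = z.split(self.split, dim=-1)        # the explicit split
        recon = self.dec(torch.cat([z_shared, z_specific], dim=-1))
        return recon, z_shared, z_specific

recon, zs, zm = SplitLatentVAESketch()(torch.randn(8, 64))
print(recon.shape, zs.shape, zm.shape)  # (8, 64) (8, 16) (8, 16)
```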

Animatable Neural Radiance Fields from Monocular RGB Videos

1 code implementation25 Jun 2021 Jianchuan Chen, Ying Zhang, Di Kang, Xuefei Zhe, Linchao Bao, Xu Jia, Huchuan Lu

We present animatable neural radiance fields (animatable NeRF) for detailed human avatar creation from monocular videos.

3D Human Reconstruction · Neural Rendering +2

Model-based 3D Hand Reconstruction via Self-Supervised Learning

1 code implementation CVPR 2021 Yujin Chen, Zhigang Tu, Di Kang, Linchao Bao, Ying Zhang, Xuefei Zhe, Ruizhi Chen, Junsong Yuan

For the first time, we demonstrate the feasibility of training an accurate 3D hand reconstruction network without relying on manual annotations.

Self-Supervised Learning

Channelized Axial Attention for Semantic Segmentation -- Considering Channel Relation within Spatial Attention for Semantic Segmentation

1 code implementation19 Jan 2021 Ye Huang, Di Kang, Wenjing Jia, Xiangjian He, Liu Liu

Spatial and channel attentions, modelling the semantic interdependencies in spatial and channel dimensions respectively, have recently been widely used for semantic segmentation.

Relation · Segmentation +1
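
As background for the abstract above, the sketch below shows a generic squeeze-and-excitation style channel attention followed by a simple spatial attention map in PyTorch; it illustrates the two attention types being combined, not the paper's channelized axial attention.

```python
import torch
import torch.nn as nn

class SpatialChannelAttentionSketch(nn.Module):
    """Generic sketch of the two attention types: channel attention reweights
    feature channels, spatial attention reweights locations."""

    def __init__(self, channels: int = 64, reduction: int = 8):
        super().__init__()
        self.channel_fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())
        self.spatial_conv = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):                                          # x: (B, C, H, W)
        b, c, _, _ = x.shape
        ch_w = self.channel_fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)  # channel weights
        x = x * ch_w
        sp_w = torch.sigmoid(self.spatial_conv(x))                   # spatial weights
        return x * sp_w

print(SpatialChannelAttentionSketch()(torch.randn(2, 64, 32, 32)).shape)
```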

High-Fidelity 3D Digital Human Head Creation from RGB-D Selfies

2 code implementations12 Oct 2020 Linchao Bao, Xiangkai Lin, Yajing Chen, Haoxian Zhang, Sheng Wang, Xuefei Zhe, Di Kang, HaoZhi Huang, Xinwei Jiang, Jue Wang, Dong Yu, Zhengyou Zhang

We present a fully automatic system that can produce high-fidelity, photo-realistic 3D digital human heads with a consumer RGB-D selfie camera.

Fusing Crowd Density Maps and Visual Object Trackers for People Tracking in Crowd Scenes

no code implementations CVPR 2018 Weihong Ren, Di Kang, Yandong Tang, Antoni B. Chan

While people tracking has been greatly improved over the recent years, crowd scenes remain particularly challenging for people tracking due to heavy occlusions, high crowd density, and significant appearance variation.

Crowd Counting by Adaptively Fusing Predictions from an Image Pyramid

no code implementations16 May 2018 Di Kang, Antoni Chan

In this paper, in contrast to using filters of different sizes, we utilize an image pyramid to deal with scale variations.

Crowd Counting
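
A minimal sketch of the pyramid-and-fuse idea, assuming a tiny PyTorch density predictor shared across scales and per-pixel fusion weights; the backbone, scales and fusion network here are placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidDensitySketch(nn.Module):
    """Sketch: run the same density predictor on an image pyramid and fuse the
    per-scale predictions with per-pixel weights."""

    def __init__(self, scales=(1.0, 0.5)):
        super().__init__()
        self.scales = scales
        self.density_head = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1))
        self.fusion_head = nn.Conv2d(3, len(scales), 1)   # per-pixel, per-scale weights

    def forward(self, img):                               # img: (B, 3, H, W)
        h, w = img.shape[-2:]
        preds = []
        for s in self.scales:
            x = img if s == 1.0 else F.interpolate(
                img, scale_factor=s, mode="bilinear", align_corners=False)
            d = self.density_head(x)
            preds.append(F.interpolate(d, size=(h, w), mode="bilinear", align_corners=False))
        weights = torch.softmax(self.fusion_head(img), dim=1)         # (B, S, H, W)
        return (torch.cat(preds, dim=1) * weights).sum(dim=1, keepdim=True)

print(PyramidDensitySketch()(torch.randn(1, 3, 128, 128)).shape)  # (1, 1, 128, 128)
```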

Incorporating Side Information by Adaptive Convolution

no code implementations NeurIPS 2017 Di Kang, Debarun Dhar, Antoni Chan

For example, for crowd counting, the camera perspective (e.g., camera angle and height) gives a clue about the appearance and scale of people in the scene.

Crowd Counting · Deblurring +1

Beyond Counting: Comparisons of Density Maps for Crowd Analysis Tasks - Counting, Detection, and Tracking

1 code implementation29 May 2017 Di Kang, Zheng Ma, Antoni B. Chan

The goal of this paper is to evaluate density maps generated by density estimation methods on a variety of crowd analysis tasks, including counting, detection, and tracking.

Density Estimation · regression
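
For context, density maps of the kind evaluated in this paper are commonly built by placing a unit impulse at each annotated head position and blurring with a Gaussian, so the map integrates to the count; the sketch below shows that standard recipe with a fixed kernel width, a simplification of the geometry-adaptive kernels some methods use.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map_from_points(points, height, width, sigma=4.0):
    """Place a unit impulse at each annotated location and blur with a
    Gaussian, so the resulting map sums (approximately) to the crowd count."""
    density = np.zeros((height, width), dtype=np.float32)
    for x, y in points:                       # points given as (x, y) pixel coords
        xi, yi = int(round(x)), int(round(y))
        if 0 <= yi < height and 0 <= xi < width:
            density[yi, xi] += 1.0
    return gaussian_filter(density, sigma=sigma)

pts = [(10.3, 20.7), (55.0, 40.2), (90.9, 90.1)]
dm = density_map_from_points(pts, height=128, width=128)
print(dm.sum())   # ~3.0: the density map integrates to the number of people
```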

Crowd Counting by Adapting Convolutional Neural Networks with Side Information

no code implementations21 Nov 2016 Di Kang, Debarun Dhar, Antoni B. Chan

In order to incorporate the available side information, we propose an adaptive convolutional neural network (ACNN), where the convolutional filter weights adapt to the current scene context via the side information.

Crowd Counting · Image Deconvolution
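
A small sketch of the adaptive-convolution idea, assuming a PyTorch auxiliary network that generates convolution weights from the side information; the dimensions, side-information encoding and placement of the adaptive layers are illustrative assumptions, not the ACNN's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveConvSketch(nn.Module):
    """Sketch of an 'adaptive convolution': a small auxiliary network maps side
    information (e.g. camera angle and height) to the weights of a conv layer,
    so the filters change with the scene context."""

    def __init__(self, side_dim=2, in_ch=16, out_ch=16, k=3):
        super().__init__()
        self.shape = (out_ch, in_ch, k, k)
        self.weight_gen = nn.Sequential(
            nn.Linear(side_dim, 128), nn.ReLU(),
            nn.Linear(128, out_ch * in_ch * k * k))

    def forward(self, feat, side_info):        # feat: (1, in_ch, H, W); side_info: (side_dim,)
        w = self.weight_gen(side_info).view(self.shape)
        return F.conv2d(feat, w, padding=self.shape[-1] // 2)

layer = AdaptiveConvSketch()
out = layer(torch.randn(1, 16, 64, 64), torch.tensor([30.0, 4.5]))  # angle (deg), height (m)
print(out.shape)  # torch.Size([1, 16, 64, 64])
```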
