Search Results for author: Jiahui Zhang

Found 32 papers, 16 papers with code

FreGS: 3D Gaussian Splatting with Progressive Frequency Regularization

no code implementations • 11 Mar 2024 • Jiahui Zhang, Fangneng Zhan, Muyu Xu, Shijian Lu, Eric Xing

3D Gaussian splatting has achieved very impressive performance in real-time novel view synthesis.

WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation

1 code implementation • 5 Dec 2023 • Jiachen Lu, Ze Huang, Zeyu Yang, Jiahui Zhang, Li Zhang

Generating multi-camera street-view videos is critical for augmenting autonomous driving datasets, addressing the urgent demand for extensive and varied data.

Autonomous Driving Scene Generation +1

Bootstrap Your Own Skills: Learning to Solve New Tasks with Large Language Model Guidance

no code implementations • 16 Oct 2023 • Jesse Zhang, Jiahui Zhang, Karl Pertsch, Ziyi Liu, Xiang Ren, Minsuk Chang, Shao-Hua Sun, Joseph J. Lim

Instead, our approach BOSS (BOotStrapping your own Skills) learns to accomplish new tasks by performing "skill bootstrapping," where an agent with a set of primitive skills interacts with the environment to practice new skills without receiving reward feedback for tasks outside of the initial skill set.

Language Modelling Large Language Model

Pose-Free Neural Radiance Fields via Implicit Pose Regularization

no code implementations • ICCV 2023 • Jiahui Zhang, Fangneng Zhan, Yingchen Yu, Kunhao Liu, Rongliang Wu, Xiaoqin Zhang, Ling Shao, Shijian Lu

However, as the pose estimator is trained with only rendered images, the pose estimation is usually biased or inaccurate for real images due to the domain gap between real images and rendered images, leading to poor robustness for the pose estimation of real images and further local minima in joint optimization.

Novel View Synthesis Pose Estimation

WaveNeRF: Wavelet-based Generalizable Neural Radiance Fields

no code implementations • ICCV 2023 • Muyu Xu, Fangneng Zhan, Jiahui Zhang, Yingchen Yu, Xiaoqin Zhang, Christian Theobalt, Ling Shao, Shijian Lu

Neural Radiance Field (NeRF) has shown impressive performance in novel view synthesis via implicit scene representation.

Novel View Synthesis

SPRINT: Scalable Policy Pre-Training via Language Instruction Relabeling

no code implementations • 20 Jun 2023 • Jesse Zhang, Karl Pertsch, Jiahui Zhang, Joseph J. Lim

Pre-training robot policies with a rich set of skills can substantially accelerate the learning of downstream tasks.

Weakly Supervised 3D Open-vocabulary Segmentation

1 code implementation • NeurIPS 2023 • Kunhao Liu, Fangneng Zhan, Jiahui Zhang, Muyu Xu, Yingchen Yu, Abdulmotaleb El Saddik, Christian Theobalt, Eric Xing, Shijian Lu

Open-vocabulary segmentation of 3D scenes is a fundamental function of human perception and thus a crucial objective in computer vision research.

Segmentation

POCE: Pose-Controllable Expression Editing

no code implementations • 18 Apr 2023 • Rongliang Wu, Yingchen Yu, Fangneng Zhan, Jiahui Zhang, Shengcai Liao, Shijian Lu

POCE achieves more accessible and realistic pose-controllable expression editing by mapping face images into UV space, where facial expressions and head poses can be disentangled and edited separately.

Audio-Driven Talking Face Generation with Diverse yet Realistic Facial Animations

no code implementations • 18 Apr 2023 • Rongliang Wu, Yingchen Yu, Fangneng Zhan, Jiahui Zhang, Xiaoqin Zhang, Shijian Lu

To accommodate fair variation of plausible facial animations for the same audio, we design a transformer-based probabilistic mapping network that can model the variational facial animation distribution conditioned upon the input audio and autoregressively convert the audio signals into a facial animation sequence.

Talking Face Generation

StyleRF: Zero-shot 3D Style Transfer of Neural Radiance Fields

1 code implementation • CVPR 2023 • Kunhao Liu, Fangneng Zhan, YiWen Chen, Jiahui Zhang, Yingchen Yu, Abdulmotaleb El Saddik, Shijian Lu, Eric Xing

In addition, it transforms the grid features according to the reference style which directly leads to high-quality zero-shot style transfer.

Style Transfer

133

Regularized Vector Quantization for Tokenized Image Synthesis

no code implementations • CVPR 2023 • Jiahui Zhang, Fangneng Zhan, Christian Theobalt, Shijian Lu

The first is a prior distribution regularization which measures the discrepancy between a prior token distribution and the predicted token distribution to avoid codebook collapse and low codebook utilization.

Image Generation Quantization

Latent Multi-Relation Reasoning for GAN-Prior based Image Super-Resolution

no code implementations • 4 Aug 2022 • Jiahui Zhang, Fangneng Zhan, Yingchen Yu, Rongliang Wu, Xiaoqin Zhang, Shijian Lu

In addition, stochastic noises fed to the generator are employed for unconditional detail generation, which tends to produce unfaithful details that compromise the fidelity of the generated SR image.

Attribute Code Generation +3

RenderNet: Visual Relocalization Using Virtual Viewpoints in Large-Scale Indoor Environments

no code implementations • 26 Jul 2022 • Jiahui Zhang, Shitao Tang, Kejie Qiu, Rui Huang, Chuan Fang, Le Cui, Zilong Dong, Siyu Zhu, Ping Tan

Visual relocalization has been a widely discussed problem in 3D vision: given a pre-constructed 3D visual map, the 6 DoF (Degrees-of-Freedom) pose of a query image is estimated.

Image Retrieval Retrieval +1

Auto-regressive Image Synthesis with Integrated Quantization

no code implementations • 21 Jul 2022 • Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Kaiwen Cui, Changgong Zhang, Shijian Lu

Extensive experiments over multiple conditional image generation tasks show that our method achieves superior diverse image generation performance qualitatively and quantitatively as compared with the state-of-the-art.

Conditional Image Generation Inductive Bias +1

VMRF: View Matching Neural Radiance Fields

no code implementations • 6 Jul 2022 • Jiahui Zhang, Fangneng Zhan, Rongliang Wu, Yingchen Yu, Wenqing Zhang, Bai Song, Xiaoqin Zhang, Shijian Lu

With the feature transport plan as the guidance, a novel pose calibration technique is designed which rectifies the initially randomized camera poses by predicting relative pose transformations between the pair of rendered and real images.

Novel View Synthesis

Towards Counterfactual Image Manipulation via CLIP

1 code implementation • 6 Jul 2022 • Yingchen Yu, Fangneng Zhan, Rongliang Wu, Jiahui Zhang, Shijian Lu, Miaomiao Cui, Xuansong Xie, Xian-Sheng Hua, Chunyan Miao

In addition, we design a simple yet effective scheme that explicitly maps CLIP embeddings (of target text) to the latent space and fuses them with latent codes for effective latent code optimization and accurate editing.

counterfactual Image Manipulation

Marginal Contrastive Correspondence for Guided Image Generation

no code implementations • CVPR 2022 • Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Shijian Lu, Changgong Zhang

We design a Marginal Contrastive Learning Network (MCL-Net) that explores contrastive learning to learn domain-invariant features for realistic exemplar-based image translation.

Contrastive Learning Image Generation +2

Modulated Contrast for Versatile Image Synthesis

1 code implementation • CVPR 2022 • Fangneng Zhan, Jiahui Zhang, Yingchen Yu, Rongliang Wu, Shijian Lu

Perceiving the similarity between images has been a long-standing and fundamental problem underlying various visual generation tasks.

Contrastive Learning Image Generation

QuadTree Attention for Vision Transformers

1 code implementation • ICLR 2022 • Shitao Tang, Jiahui Zhang, Siyu Zhu, Ping Tan

Transformers have been successful in many vision tasks, thanks to their capability of capturing long-range dependency.

Object Detection +2

322

An Intelligent Self-driving Truck System For Highway Transportation

no code implementations • 31 Dec 2021 • Dawei Wang, Lingping Gao, Ziquan Lan, Wei Li, Jiaping Ren, Jiahui Zhang, Peng Zhang, Pei Zhou, Shengao Wang, Jia Pan, Dinesh Manocha, Ruigang Yang

Recently, there have been many advances in the autonomous driving community, attracting a lot of attention from academia and industry.

Autonomous Driving Decision Making

Multimodal Image Synthesis and Editing: The Generative AI Era

2 code implementations • 27 Dec 2021 • Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Shijian Lu, Lingjie Liu, Adam Kortylewski, Christian Theobalt, Eric Xing

With superb power in modeling the interaction among multimodal information, multimodal image synthesis and editing has become a hot research topic in recent years.

Image Generation

753

Awakening Latent Grounding from Pretrained Language Models for Semantic Parsing

1 code implementation • Findings (ACL) 2021 • Qian Liu, Dejian Yang, Jiahui Zhang, Jiaqi Guo, Bin Zhou, Jian-Guang Lou

In recent years, pretrained language models (PLMs) have achieved success on several downstream tasks, showing their power in modeling language.

Semantic Parsing Text-To-SQL

360

Learning to Match Features with Seeded Graph Matching Network

1 code implementation • ICCV 2021 • Hongkai Chen, Zixin Luo, Jiahui Zhang, Lei Zhou, Xuyang Bai, Zeyu Hu, Chiew-Lan Tai, Long Quan

2) Seeded Graph Neural Network, which utilizes seed matches to pass messages within/across images and predicts assignment costs.

Graph Matching

130

Bi-level Feature Alignment for Versatile Image Translation and Manipulation

2 code implementations • 7 Jul 2021 • Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Kaiwen Cui, Aoran Xiao, Shijian Lu, Chunyan Miao

This paper presents a versatile image translation and manipulation framework that achieves accurate semantic and style guidance in image generation by explicitly building a correspondence.

Image Generation Translation

Blind Image Super-Resolution via Contrastive Representation Learning

no code implementations • 1 Jul 2021 • Jiahui Zhang, Shijian Lu, Fangneng Zhan, Yingchen Yu

Extensive experiments on synthetic datasets and real images show that the proposed CRL-SR can handle multi-modal and spatially variant degradation effectively under blind settings and it also outperforms state-of-the-art SR methods qualitatively and quantitatively.

Contrastive Learning Image Super-Resolution +1

KFNet: Learning Temporal Camera Relocalization using Kalman Filtering

1 code implementation • CVPR 2020 • Lei Zhou, Zixin Luo, Tianwei Shen, Jiahui Zhang, Mingmin Zhen, Yao Yao, Tian Fang, Long Quan

Temporal camera relocalization estimates the pose with respect to each video frame in sequence, as opposed to one-shot relocalization which focuses on a still image.

Camera Relocalization

210

ASLFeat: Learning Local Features of Accurate Shape and Localization

4 code implementations • CVPR 2020 • Zixin Luo, Lei Zhou, Xuyang Bai, Hongkai Chen, Jiahui Zhang, Yao Yao, Shiwei Li, Tian Fang, Long Quan

This work focuses on mitigating two limitations in the joint learning of local feature detectors and descriptors.

3D Reconstruction Keypoint detection and image matching

301

Self-Supervised Learning of Depth and Motion Under Photometric Inconsistency

1 code implementation • 19 Sep 2019 • Tianwei Shen, Lei Zhou, Zixin Luo, Yao Yao, Shiwei Li, Jiahui Zhang, Tian Fang, Long Quan

The self-supervised learning of depth and pose from monocular sequences provides an attractive solution by using the photometric consistency of nearby frames as it depends much less on the ground-truth data.

Pose Estimation Self-Supervised Learning

197

Learning Two-View Correspondences and Geometry Using Order-Aware Network

1 code implementation • ICCV 2019 • Jiahui Zhang, Dawei Sun, Zixin Luo, Anbang Yao, Lei Zhou, Tianwei Shen, Yurong Chen, Long Quan, Hongen Liao

First, to capture the local context of sparse correspondences, the network clusters unordered input correspondences by learning a soft assignment matrix.

Vocal Bursts Valence Prediction

248

Efficient Semantic Scene Completion Network with Spatial Group Convolution

1 code implementation • ECCV 2018 • Jiahui Zhang, Hao Zhao, Anbang Yao, Yurong Chen, Li Zhang, Hongen Liao

We introduce Spatial Group Convolution (SGC) for accelerating the computation of 3D dense prediction tasks.

Ranked #9 on 3D Semantic Scene Completion on SemanticKITTI

3D Semantic Scene Completion valid

ContextDesc: Local Descriptor Augmentation with Cross-Modality Context

1 code implementation • CVPR 2019 • Zixin Luo, Tianwei Shen, Lei Zhou, Jiahui Zhang, Yao Yao, Shiwei Li, Tian Fang, Long Quan

Most existing studies on learning local features focus on the patch-based descriptions of individual keypoints, while neglecting the spatial relations established from their keypoint locations.

Geometric Matching

227

Reconstruction and Registration of Large-Scale Medical Scene Using Point Clouds Data from Different Modalities

no code implementations • 5 Sep 2018 • Ke Wang, Han Song, Jiahui Zhang, Xinran Zhang, Hongen Liao

In this paper, we propose a method that fuses 3D data from different modalities to obtain a large-scale, dense point cloud.
