Search Results for author: Jiao Dai

Found 11 papers, 2 papers with code

Model Will Tell: Training Membership Inference for Diffusion Models

no code implementations • 13 Mar 2024 • Xiaomeng Fu, Xi Wang, Qiao Li, Jin Liu, Jiao Dai, Jizhong Han

In this paper, we explore a novel perspective on the training membership inference (TMI) task by leveraging the intrinsic generative priors within the diffusion model.

Binary Classification

Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training

no code implementations • 4 Dec 2023 • Runze He, Shaofei Huang, Xuecheng Nie, Tianrui Hui, Luoqi Liu, Jiao Dai, Jizhong Han, Guanbin Li, Si Liu

In this paper, we target the adaptive source-driven 3D scene editing task by proposing a CustomNeRF model that unifies a text description or a reference image as the editing prompt.

3D Scene Editing

Enriching Phrases with Coupled Pixel and Object Contexts for Panoptic Narrative Grounding

no code implementations • 2 Nov 2023 • Tianrui Hui, Zihan Ding, Junshi Huang, Xiaoming Wei, Xiaolin Wei, Jiao Dai, Jizhong Han, Si Liu

Panoptic narrative grounding (PNG) aims to segment things and stuff objects in an image described by noun phrases of a narrative caption.

Object

OSM-Net: One-to-Many One-shot Talking Head Generation with Spontaneous Head Motions

no code implementations • 28 Sep 2023 • Jin Liu, Xi Wang, Xiaomeng Fu, Yesheng Chai, Cai Yu, Jiao Dai, Jizhong Han

Other works construct a one-to-one mapping between the audio signal and head motion sequences, which introduces ambiguous correspondences into the mapping, since people can behave differently in head motion when speaking the same content.

Talking Head Generation · Video Generation

Discovering Sounding Objects by Audio Queries for Audio Visual Segmentation

no code implementations • 18 Sep 2023 • Shaofei Huang, Han Li, Yuqing Wang, Hongji Zhu, Jiao Dai, Jizhong Han, Wenge Rong, Si Liu

Explicit object-level semantic correspondence between audio and visual modalities is established by gathering object information from visual features with predefined audio queries.

Object · Semantic Correspondence

MFR-Net: Multi-faceted Responsive Listening Head Generation via Denoising Diffusion Model

no code implementations • 31 Aug 2023 • Jin Liu, Xi Wang, Xiaomeng Fu, Yesheng Chai, Cai Yu, Jiao Dai, Jizhong Han

Responsive listening head generation is an important task that aims to model face-to-face communication scenarios by generating a listener head video given a speaker video and a listener head image.

Denoising

FONT: Flow-guided One-shot Talking Head Generation with Natural Head Motions

no code implementations • 31 Mar 2023 • Jin Liu, Xi Wang, Xiaomeng Fu, Yesheng Chai, Cai Yu, Jiao Dai, Jizhong Han

Specifically, the head pose prediction module is designed to generate head pose sequences from the source face and driving audio.

Pose Prediction · Talking Head Generation · +1

OPT: One-shot Pose-Controllable Talking Head Generation

no code implementations • 16 Feb 2023 • Jin Liu, Xi Wang, Xiaomeng Fu, Yesheng Chai, Cai Yu, Jiao Dai, Jizhong Han

To solve the identity mismatch problem and achieve high-quality free pose control, we present the One-shot Pose-controllable Talking head generation network (OPT).

Disentanglement · Talking Head Generation

Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection

1 code implementation • CVPR 2023 • Shaofei Huang, Zhenwei Shen, Zehao Huang, Zi-han Ding, Jiao Dai, Jizhong Han, Naiyan Wang, Si Liu

An attempt has been made to get rid of BEV and predict 3D lanes from FV representations directly, but it still underperforms BEV-based methods due to its lack of structured representation for 3D lanes.

3D Lane Detection

Bridging Search Region Interaction With Template for RGB-T Tracking

1 code implementation • CVPR 2023 • Tianrui Hui, Zizheng Xun, Fengguang Peng, Junshi Huang, Xiaoming Wei, Xiaolin Wei, Jiao Dai, Jizhong Han, Si Liu

To alleviate these limitations, we propose a novel Template-Bridged Search region Interaction (TBSI) module which exploits templates as the medium to bridge the cross-modal interaction between RGB and TIR search regions by gathering and distributing target-relevant object and environment contexts.

RGB-T Tracking · Template Matching

LI-Net: Large-Pose Identity-Preserving Face Reenactment Network

no code implementations • 7 Apr 2021 • Jin Liu, Peng Chen, Tao Liang, Zhaoxing Li, Cai Yu, Shuqiao Zou, Jiao Dai, Jizhong Han

Face reenactment is a challenging task, as it is difficult to maintain accurate expression, pose and identity simultaneously.

Face Reenactment
