Search Results for author: Yan Di

Found 22 papers, 10 papers with code

EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion

1 code implementation • 2 May 2024 • Guangyao Zhai, Evin Pınar Örnek, Dave Zhenyu Chen, Ruotong Liao, Yan Di, Nassir Navab, Federico Tombari, Benjamin Busam

The scheme ensures that the denoising processes are influenced by a holistic understanding of the scene graph, facilitating the generation of globally coherent scenes.

3D Object Retrieval Denoising +2

GeoGaussian: Geometry-aware Gaussian Splatting for Scene Rendering

no code implementations • 17 Mar 2024 • Yanyan Li, Chenyu Lyu, Yan Di, Guangyao Zhai, Gim Hee Lee, Federico Tombari

During the Gaussian Splatting optimization process, the scene's geometry can gradually deteriorate if its structure is not deliberately preserved, especially in non-textured regions such as walls, ceilings, and furniture surfaces.

Novel View Synthesis

KP-RED: Exploiting Semantic Keypoints for Joint 3D Shape Retrieval and Deformation

1 code implementation • 15 Mar 2024 • Ruida Zhang, Chenyangguang Zhang, Yan Di, Fabian Manhardt, Xingyu Liu, Federico Tombari, Xiangyang Ji

In this paper, we present KP-RED, a unified KeyPoint-driven REtrieval and Deformation framework that takes object scans as input and jointly retrieves and deforms the most geometrically similar CAD models from a pre-processed database to tightly match the target.

3D Shape Retrieval • Retrieval

FMGS: Foundation Model Embedded 3D Gaussian Splatting for Holistic 3D Scene Understanding

no code implementations • 3 Jan 2024 • Xingxing Zuo, Pouya Samangouei, Yunwen Zhou, Yan Di, Mingyang Li

This is achieved by distilling feature maps generated from image-based foundation models into those rendered from our 3D model.

Object Detection +1

D-SCo: Dual-Stream Conditional Diffusion for Monocular Hand-Held Object Reconstruction

no code implementations • 23 Nov 2023 • Bowen Fu, Gu Wang, Chenyangguang Zhang, Yan Di, Ziqin Huang, Zhiying Leng, Fabian Manhardt, Xiangyang Ji, Federico Tombari

Second, we introduce a dual-stream denoiser to semantically and geometrically model hand-object interactions with a novel unified hand-object semantic embedding, enhancing the reconstruction performance of the hand-occluded region of the object.

Denoising • Object +1

HiPose: Hierarchical Binary Surface Encoding and Correspondence Pruning for RGB-D 6DoF Object Pose Estimation

1 code implementation • 21 Nov 2023 • Yongliang Lin, Yongzhi Su, Praveen Nathan, Sandeep Inuganti, Yan Di, Martin Sundermeyer, Fabian Manhardt, Didier Stricker, Jason Rambach, Yu Zhang

In this work, we present a novel dense-correspondence method for 6DoF object pose estimation from a single RGB-D image.

Pose Estimation

SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation

1 code implementation • 18 Nov 2023 • Yamei Chen, Yan Di, Guangyao Zhai, Fabian Manhardt, Chenyangguang Zhang, Ruida Zhang, Federico Tombari, Nassir Navab, Benjamin Busam

Leveraging the advantage of DINOv2 in providing SE(3)-consistent semantic features, we hierarchically extract two types of SE(3)-invariant geometric features to further encapsulate local-to-global object-specific information.

Object Pose Estimation

ShapeMatcher: Self-Supervised Joint Shape Canonicalization, Segmentation, Retrieval and Deformation

1 code implementation • 18 Nov 2023 • Yan Di, Chenyangguang Zhang, Chaowei Wang, Ruida Zhang, Guangyao Zhai, Yanyan Li, Bowen Fu, Xiangyang Ji, Shan Gao

In this paper, we present ShapeMatcher, a unified self-supervised learning framework for joint shape canonicalization, segmentation, retrieval and deformation.

Object Retrieval +2

MOHO: Learning Single-view Hand-held Object Reconstruction with Multi-view Occlusion-Aware Supervision

no code implementations • 18 Oct 2023 • Chenyangguang Zhang, Guanlong Jiao, Yan Di, Gu Wang, Ziqin Huang, Ruida Zhang, Fabian Manhardt, Bowen Fu, Federico Tombari, Xiangyang Ji

Previous works concerning single-view hand-held object reconstruction typically rely on supervision from 3D ground-truth models, which are hard to collect in the real world.

Object • Object Reconstruction

SG-Bot: Object Rearrangement via Coarse-to-Fine Robotic Imagination on Scene Graphs

no code implementations • 21 Sep 2023 • Guangyao Zhai, Xiaoni Cai, Dianye Huang, Yan Di, Fabian Manhardt, Federico Tombari, Nassir Navab, Benjamin Busam

In this paper, we present SG-Bot, a novel rearrangement framework that utilizes a coarse-to-fine scheme with a scene graph as the scene representation.

CCD-3DR: Consistent Conditioning in Diffusion for Single-Image 3D Reconstruction

no code implementations • 15 Aug 2023 • Yan Di, Chenyangguang Zhang, Pengyuan Wang, Guangyao Zhai, Ruida Zhang, Fabian Manhardt, Benjamin Busam, Xiangyang Ji, Federico Tombari

However, such strategies fail to consistently align the denoised point cloud with the given image, leading to unstable conditioning and inferior performance.

3D Reconstruction

U-RED: Unsupervised 3D Shape Retrieval and Deformation for Partial Point Clouds

1 code implementation • ICCV 2023 • Yan Di, Chenyangguang Zhang, Ruida Zhang, Fabian Manhardt, Yongzhi Su, Jason Rambach, Didier Stricker, Xiangyang Ji, Federico Tombari

In this paper, we propose U-RED, an Unsupervised shape REtrieval and Deformation pipeline that takes an arbitrary object observation as input, typically captured by RGB images or scans, and jointly retrieves and deforms the geometrically similar CAD models from a pre-established database to tightly match the target.

3D Shape Retrieval • Retrieval

CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion

1 code implementation • NeurIPS 2023 • Guangyao Zhai, Evin Pınar Örnek, Shun-Cheng Wu, Yan Di, Federico Tombari, Nassir Navab, Benjamin Busam

The generated scenes can be manipulated by editing the input scene graph and sampling the noise in the diffusion model.

Object • Scene Generation

IPCC-TP: Utilizing Incremental Pearson Correlation Coefficient for Joint Multi-Agent Trajectory Prediction

no code implementations • CVPR 2023 • Dekai Zhu, Guangyao Zhai, Yan Di, Fabian Manhardt, Hendrik Berkemeyer, Tuan Tran, Nassir Navab, Federico Tombari, Benjamin Busam

Reliable multi-agent trajectory prediction is crucial for the safe planning and control of autonomous systems.

Trajectory Prediction

SST: Real-time End-to-end Monocular 3D Reconstruction via Sparse Spatial-Temporal Guidance

no code implementations • 13 Dec 2022 • Chenyangguang Zhang, Zhiqiang Lou, Yan Di, Federico Tombari, Xiangyang Ji

Real-time monocular 3D reconstruction is a challenging problem that remains unsolved.

3D Reconstruction

OPA-3D: Occlusion-Aware Pixel-Wise Aggregation for Monocular 3D Object Detection

no code implementations • 2 Nov 2022 • Yongzhi Su, Yan Di, Fabian Manhardt, Guangyao Zhai, Jason Rambach, Benjamin Busam, Didier Stricker, Federico Tombari

Despite monocular 3D object detection having recently made a significant leap forward thanks to the use of pre-trained depth estimators for pseudo-LiDAR recovery, such two-stage methods typically suffer from overfitting and are incapable of explicitly encapsulating the geometric relation between depth and object bounding box.

Monocular 3D Object Detection • Object +1

MonoGraspNet: 6-DoF Grasping with a Single RGB Image

no code implementations • 26 Sep 2022 • Guangyao Zhai, Dianye Huang, Shun-Cheng Wu, HyunJun Jung, Yan Di, Fabian Manhardt, Federico Tombari, Nassir Navab, Benjamin Busam

6-DoF robotic grasping is a long-standing but unsolved problem.

Robotic Grasping

SSP-Pose: Symmetry-Aware Shape Prior Deformation for Direct Category-Level Object Pose Estimation

no code implementations • 13 Aug 2022 • Ruida Zhang, Yan Di, Fabian Manhardt, Federico Tombari, Xiangyang Ji

In this paper, to handle these shortcomings, we propose an end-to-end trainable network SSP-Pose for category-level pose estimation, which integrates shape priors into a direct pose regression network.

Pose Estimation • Regression

RBP-Pose: Residual Bounding Box Projection for Category-Level Pose Estimation

1 code implementation • 30 Jul 2022 • Ruida Zhang, Yan Di, Zhiqiang Lou, Fabian Manhardt, Federico Tombari, Xiangyang Ji

Category-level object pose estimation aims to predict the 6D pose as well as the 3D metric size of arbitrary objects from a known set of categories.

Object Pose Estimation

GPV-Pose: Category-level Object Pose Estimation via Geometry-guided Point-wise Voting

3 code implementations • CVPR 2022 • Yan Di, Ruida Zhang, Zhiqiang Lou, Fabian Manhardt, Xiangyang Ji, Nassir Navab, Federico Tombari

While 6D object pose estimation has recently made a huge leap forward, most methods can still only handle a single or a handful of different objects, which limits their applications.

Ranked #1 on 6D Pose Estimation on LineMOD (Mean ADD-S metric)

6D Pose Estimation • 6D Pose Estimation using RGB +3

SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

2 code implementations • ICCV 2021 • Yan Di, Fabian Manhardt, Gu Wang, Xiangyang Ji, Nassir Navab, Federico Tombari

Directly regressing all 6 degrees-of-freedom (6DoF) for the object pose (e.g., the 3D rotation and translation) in a cluttered environment from a single RGB image is a challenging problem.

Ranked #1 on 6D Pose Estimation using RGB on Occlusion LineMOD

6D Pose Estimation • 6D Pose Estimation using RGB +1

Monocular Piecewise Depth Estimation in Dynamic Scenes by Exploiting Superpixel Relations

no code implementations • ICCV 2019 • Yan Di, Henrique Morimitsu, Shan Gao, Xiangyang Ji

Our core idea is to predict spatial relations based on the corresponding motion relations.

Monocular Depth Estimation • Optical Flow Estimation +2
