Search Results for author: Wenzhao Zheng

Found 29 papers, 24 papers with code

Structural Deep Metric Learning for Room Layout Estimation

no code implementations • ECCV 2020 • Wenzhao Zheng, Jiwen Lu, Jie Zhou

We employ a metric model and a layout encoder to map the RGB images and the ground-truth layouts to the embedding space, respectively, and a layout decoder to map the embeddings to the corresponding layouts, where the whole framework is trained in an end-to-end manner.

Metric Learning Room Layout Estimation

GenAD: Generative End-to-End Autonomous Driving

1 code implementation • 18 Feb 2024 • Wenzhao Zheng, Ruiqi Song, Xianda Guo, Chenming Zhang, Long Chen

We then employ a variational autoencoder to learn the future trajectory distribution in a structural latent space for trajectory prior modeling.

Autonomous Driving motion prediction

Path Choice Matters for Clear Attribution in Path Methods

1 code implementation • 19 Jan 2024 • Borui Zhang, Wenzhao Zheng, Jie Zhou, Jiwen Lu

Rigorousness and clarity are both essential for interpretations of DNNs to engender human trust.

OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving

1 code implementation • 27 Nov 2023 • Wenzhao Zheng, Weiliang Chen, Yuanhui Huang, Borui Zhang, Yueqi Duan, Jiwen Lu

In this paper, we explore a new framework of learning a world model, OccWorld, in the 3D Occupancy space to simultaneously predict the movement of the ego car and the evolution of the surrounding scenes.

Autonomous Driving

266

SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction

1 code implementation • 21 Nov 2023 • Yuanhui Huang, Wenzhao Zheng, Borui Zhang, Jie Zhou, Jiwen Lu

Our SelfOcc outperforms the previous best method SceneRF by 58.7% using a single frame as input on SemanticKITTI and is the first self-supervised work that produces reasonable 3D occupancy for surround cameras on nuScenes.

Autonomous Driving Monocular Depth Estimation

222

LiDAR-HMR: 3D Human Mesh Recovery from LiDAR

2 code implementations • 20 Nov 2023 • Bohao Fan, Wenzhao Zheng, Jianjiang Feng, Jie Zhou

In recent years, point cloud perception tasks have been garnering increasing attention.

Ranked #1 on 3D Human Pose Estimation on SLOPER4D

3D Human Pose Estimation Human Mesh Recovery

Exploring Unified Perspective For Fast Shapley Value Estimation

1 code implementation • 2 Nov 2023 • Borui Zhang, Baotong Tian, Wenzhao Zheng, Jie Zhou, Jiwen Lu

Shapley values have emerged as a widely accepted and trustworthy tool, grounded in theoretical axioms, for addressing challenges posed by black-box models like deep neural networks.

Introspective Deep Metric Learning

2 code implementations • 11 Sep 2023 • Chengkun Wang, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu

This paper proposes an introspective deep metric learning (IDML) framework for uncertainty-aware comparisons of images.

Image Retrieval Metric Learning

PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction

1 code implementation • 31 Aug 2023 • Sicheng Zuo, Wenzhao Zheng, Yuanhui Huang, Jie Zhou, Jiwen Lu

To address this, we propose a cylindrical tri-perspective view to represent point clouds effectively and comprehensively and a PointOcc model to process them efficiently.

3D Semantic Occupancy Prediction Autonomous Driving +2

103

Human-M3: A Multi-view Multi-modal Dataset for 3D Human Pose Estimation in Outdoor Scenes

1 code implementation • 1 Aug 2023 • Bohao Fan, Siqi Wang, Wenxuan Guo, Wenzhao Zheng, Jianjiang Feng, Jie Zhou

In this article, we propose Human-M3, an outdoor multi-modal multi-view multi-person human pose database which includes not only multi-view RGB videos of outdoor scenes but also corresponding point clouds.

3D Human Pose Estimation

SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving

2 code implementations • ICCV 2023 • Yi Wei, Linqing Zhao, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu

Towards a more comprehensive perception of a 3D scene, in this paper, we propose a SurroundOcc method to predict the 3D occupancy with multi-camera images.

3D Object Detection Autonomous Driving +2

680

Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction

2 code implementations • CVPR 2023 • Yuanhui Huang, Wenzhao Zheng, Yunpeng Zhang, Jie Zhou, Jiwen Lu

To lift image features to the 3D TPV space, we further propose a transformer-based TPV encoder (TPVFormer) to obtain the TPV features effectively.

Ranked #1 on Prediction Of Occupancy Grid Maps on nuScenes

3D Semantic Scene Completion Autonomous Driving +1

4,813

Deep Factorized Metric Learning

1 code implementation • CVPR 2023 • Chengkun Wang, Wenzhao Zheng, Junlong Li, Jie Zhou, Jiwen Lu

Learning a generalizable and comprehensive similarity metric to depict the semantic discrepancies between images is the foundation of many computer vision tasks.

Image Classification Metric Learning

Bort: Towards Explainable Neural Networks with Bounded Orthogonal Constraint

1 code implementation • 18 Dec 2022 • Borui Zhang, Wenzhao Zheng, Jie Zhou, Jiwen Lu

Deep learning has revolutionized human society, yet the black-box nature of deep neural networks hinders further application to reliability-demanded industries.

Probabilistic Deep Metric Learning for Hyperspectral Image Classification

1 code implementation • 15 Nov 2022 • Chengkun Wang, Wenzhao Zheng, Xian Sun, Jiwen Lu, Jie Zhou

We propose to learn a global probabilistic distribution for each pixel in the patch and a probabilistic metric to model the distance between distributions.

Classification Hyperspectral Image Classification +1

Token-Label Alignment for Vision Transformers

1 code implementation • ICCV 2023 • Han Xiao, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu

Data mixing strategies (e.g., CutMix) have shown the ability to greatly improve the performance of convolutional neural networks (CNNs).

Image Classification Semantic Segmentation +1

OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions

1 code implementation • ICCV 2023 • Chengkun Wang, Wenzhao Zheng, Zheng Zhu, Jie Zhou, Jiwen Lu

The pretrain-finetune paradigm in modern computer vision facilitates the success of self-supervised learning, which tends to achieve better transferability than supervised learning.

Image Classification object-detection +3

A Simple Baseline for Multi-Camera 3D Object Detection

1 code implementation • 22 Aug 2022 • Yunpeng Zhang, Wenzhao Zheng, Zheng Zhu, Guan Huang, Jie Zhou, Jiwen Lu

First, we extract multi-scale features and generate the perspective object proposals on each monocular image.

Autonomous Driving Monocular 3D Object Detection +2

BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving

1 code implementation • 19 May 2022 • Yunpeng Zhang, Zheng Zhu, Wenzhao Zheng, Junjie Huang, Guan Huang, Jie Zhou, Jiwen Lu

Specifically, BEVerse first performs shared feature extraction and lifting to generate 4D BEV representations from multi-timestamp and multi-view images.

Ranked #15 on Robust Camera Only 3D Object Detection on nuScenes-C

3D Object Detection Autonomous Driving +4

368

Introspective Deep Metric Learning for Image Retrieval

2 code implementations • 9 May 2022 • Wenzhao Zheng, Chengkun Wang, Jie Zhou, Jiwen Lu

This paper proposes an introspective deep metric learning (IDML) framework for uncertainty-aware comparisons of images.

Image Classification Image Retrieval +2

SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation

1 code implementation • 7 Apr 2022 • Yi Wei, Linqing Zhao, Wenzhao Zheng, Zheng Zhu, Yongming Rao, Guan Huang, Jiwen Lu, Jie Zhou

In this paper, we propose a SurroundDepth method to incorporate the information from multiple surrounding views to predict depth maps across cameras.

Autonomous Driving Monocular Depth Estimation

237

Attributable Visual Similarity Learning

1 code implementation • CVPR 2022 • Borui Zhang, Wenzhao Zheng, Jie Zhou, Jiwen Lu

This paper proposes an attributable visual similarity learning (AVSL) framework for a more accurate and explainable similarity measure between images.

Ranked #3 on Metric Learning on CARS196 (using extra training data)

Metric Learning Semantic Similarity +1

A Roadmap for Big Model

no code implementations • 26 Mar 2022 • Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han, Zhenghao Liu, Ning Ding, Yongming Rao, Yizhao Gao, Liang Zhang, Ming Ding, Cong Fang, Yisen Wang, Mingsheng Long, Jing Zhang, Yinpeng Dong, Tianyu Pang, Peng Cui, Lingxiao Huang, Zheng Liang, Huawei Shen, Hui Zhang, Quanshi Zhang, Qingxiu Dong, Zhixing Tan, Mingxuan Wang, Shuo Wang, Long Zhou, Haoran Li, Junwei Bao, Yingwei Pan, Weinan Zhang, Zhou Yu, Rui Yan, Chence Shi, Minghao Xu, Zuobai Zhang, Guoqiang Wang, Xiang Pan, Mengjie Li, Xiaoyu Chu, Zijun Yao, Fangwei Zhu, Shulin Cao, Weicheng Xue, Zixuan Ma, Zhengyan Zhang, Shengding Hu, Yujia Qin, Chaojun Xiao, Zheni Zeng, Ganqu Cui, Weize Chen, Weilin Zhao, Yuan Yao, Peng Li, Wenzhao Zheng, Wenliang Zhao, Ziyi Wang, Borui Zhang, Nanyi Fei, Anwen Hu, Zenan Ling, Haoyang Li, Boxi Cao, Xianpei Han, Weidong Zhan, Baobao Chang, Hao Sun, Jiawen Deng, Chujie Zheng, Juanzi Li, Lei Hou, Xigang Cao, Jidong Zhai, Zhiyuan Liu, Maosong Sun, Jiwen Lu, Zhiwu Lu, Qin Jin, Ruihua Song, Ji-Rong Wen, Zhouchen Lin, Liwei Wang, Hang Su, Jun Zhu, Zhifang Sui, Jiajun Zhang, Yang Liu, Xiaodong He, Minlie Huang, Jian Tang, Jie Tang

With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm.

Language Modelling Machine Translation +1

Dimension Embeddings for Monocular 3D Object Detection

no code implementations • CVPR 2022 • Yunpeng Zhang, Wenzhao Zheng, Zheng Zhu, Guan Huang, Dalong Du, Jie Zhou, Jiwen Lu

In this paper, we propose a general method to learn appropriate embeddings for dimension estimation in monocular 3D object detection.

Monocular 3D Object Detection Object +1

Deep Relational Metric Learning

1 code implementation • ICCV 2021 • Wenzhao Zheng, Borui Zhang, Jiwen Lu, Jie Zhou

This paper presents a deep relational metric learning (DRML) framework for image clustering and retrieval.

Image Clustering Metric Learning +1

Deep Compositional Metric Learning

1 code implementation • CVPR 2021 • Wenzhao Zheng, Chengkun Wang, Jiwen Lu, Jie Zhou

In this paper, we propose a deep compositional metric learning (DCML) framework for effective and generalizable similarity measurement between images.

Metric Learning

Deep Metric Learning via Adaptive Learnable Assessment

no code implementations • CVPR 2020 • Wenzhao Zheng, Jiwen Lu, Jie Zhou

In this paper, we propose a deep metric learning via adaptive learnable assessment (DML-ALA) method for image retrieval and clustering, which aims to learn a sample assessment strategy to maximize the generalization of the trained metric.

Clustering Image Retrieval +3

Hardness-Aware Deep Metric Learning

2 code implementations • CVPR 2019 • Wenzhao Zheng, Zhaodong Chen, Jiwen Lu, Jie Zhou

This paper presents a hardness-aware deep metric learning (HDML) framework.

Ranked #30 on Metric Learning on CUB-200-2011 (using extra training data)

Image Retrieval Metric Learning

149

Deep Adversarial Metric Learning

no code implementations • CVPR 2018 • Yueqi Duan, Wenzhao Zheng, Xudong Lin, Jiwen Lu, Jie Zhou

Learning an effective distance metric between image pairs plays an important role in visual analysis, where the training procedure largely relies on hard negative samples.

Metric Learning
