1 code implementation • 27 Feb 2024 • Tao Tang, Guangrun Wang, Yixing Lao, Peng Chen, Jie Liu, Liang Lin, Kaicheng Yu, Xiaodan Liang
Through extensive experiments across various datasets and scenes, we demonstrate the effectiveness of our approach in facilitating better interaction between LiDAR and camera modalities within a unified neural field.
no code implementations • 12 Dec 2023 • Hu Zhang, Jianhua Xu, Tao Tang, Haiyang Sun, Xin Yu, Zi Huang, Kaicheng Yu
OpenSight utilizes 2D-3D geometric priors for the initial discernment and localization of generic objects, followed by a more specific semantic interpretation of the detected objects.
no code implementations • 28 Sep 2023 • Lei Yang, Tao Tang, Jun Li, Peng Chen, Kun Yuan, Li Wang, Yi Huang, Xinyu Zhang, Kaicheng Yu
In essence, we regress the height to the ground to achieve a distance-agnostic formulation to ease the optimization process of camera-only perception methods.
no code implementations • 11 Sep 2023 • Chunyong Hu, Hang Zheng, Kun Li, Jianyun Xu, Weibo Mao, Maochun Luo, Lingxuan Wang, Mingxia Chen, Qihao Peng, Kaixuan Liu, Yiru Zhao, Peihan Hao, Minzhe Liu, Kaicheng Yu
Multi-sensor modal fusion has demonstrated strong advantages in 3D object detection tasks.
1 code implementation • 18 Aug 2023 • Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao
In contrast, such privilege has not yet fully benefited 3D deep learning, mainly due to the limited availability of large-scale 3D datasets.
Ranked #3 on 3D Semantic Segmentation on SemanticKITTI (val mIoU metric, using extra training data)
no code implementations • 3 Aug 2023 • Kairui Yang, Enhui Ma, Jibin Peng, Qing Guo, Di Lin, Kaicheng Yu
To this end, we propose a two-stage generative method, dubbed BEVControl, that can generate accurate foreground and background contents.
1 code implementation • 2 Aug 2023 • Tengju Ye, Wei Jing, Chunyong Hu, Shikun Huang, Lingping Gao, Fangzhen Li, Jingke Wang, Ke Guo, Wencong Xiao, Weibo Mao, Hang Zheng, Kun Li, Junbo Chen, Kaicheng Yu
Building a multi-modality multi-task neural network toward accurate and robust performance is a de-facto standard in perception tasks for autonomous driving.
no code implementations • 24 Jul 2023 • Shangzhan Zhang, Sida Peng, Yinji ShenTu, Qing Shuai, Tianrun Chen, Kaicheng Yu, Hujun Bao, Xiaowei Zhou
We extensively evaluate our approach on various scenes and show that our approach achieves spatially and temporally consistent editing results.
1 code implementation • 20 Apr 2023 • Tang Tao, Longfei Gao, Guangrun Wang, Yixing Lao, Peng Chen, Hengshuang Zhao, Dayang Hao, Xiaodan Liang, Mathieu Salzmann, Kaicheng Yu
We address this challenge by formulating, to the best of our knowledge, the first differentiable end-to-end LiDAR rendering framework, LiDAR-NeRF, leveraging a neural radiance field (NeRF) to facilitate the joint learning of geometry and the attributes of 3D points.
1 code implementation • CVPR 2023 • Lei Yang, Kaicheng Yu, Tao Tang, Jun Li, Kun Yuan, Li Wang, Xinyu Zhang, Peng Chen
In essence, instead of predicting the pixel-wise depth, we regress the height to the ground to achieve a distance-agnostic formulation to ease the optimization process of camera-only perception methods.
Ranked #3 on 3D Object Detection on Rope3D
no code implementations • CVPR 2023 • Shangzhan Zhang, Sida Peng, Tianrun Chen, Linzhan Mou, Haotong Lin, Kaicheng Yu, Yiyi Liao, Xiaowei Zhou
We introduce a novel approach that takes a single semantic mask as input to synthesize multi-view consistent color images of natural scenes, trained with a collection of single images from the Internet.
no code implementations • 18 Nov 2022 • Bicheng Guo, Shuxuan Guo, Miaojing Shi, Peng Chen, Shibo He, Jiming Chen, Kaicheng Yu
Differentiable architecture search (DARTS) has been a mainstream direction in automatic machine learning.
Ranked #10 on Neural Architecture Search on NAS-Bench-201, CIFAR-100
1 code implementation • 16 Oct 2022 • Tao Tang, Changlin Li, Guangrun Wang, Kaicheng Yu, Xiaojun Chang, Xiaodan Liang
Despite the success, its development and application on self-supervised vision transformers have been hindered by several barriers, including the high search cost, the lack of supervision, and the unsuitable search space.
1 code implementation • 30 May 2022 • Kaicheng Yu, Tang Tao, Hongwei Xie, Zhiwei Lin, Zhongwei Wu, Zhongyu Xia, TingTing Liang, Haiyang Sun, Jiong Deng, Dayang Hao, Yongtao Wang, Xiaodan Liang, Bing Wang
There are two critical sensors for 3D perception in autonomous driving, the camera and the LiDAR.
2 code implementations • 27 May 2022 • TingTing Liang, Hongwei Xie, Kaicheng Yu, Zhongyu Xia, Zhiwei Lin, Yongtao Wang, Tao Tang, Bing Wang, Zhi Tang
Fusing the camera and LiDAR information has become a de-facto standard for 3D object detection tasks.
2 code implementations • CVPR 2022 • Sihao Lin, Hongwei Xie, Bing Wang, Kaicheng Yu, Xiaojun Chang, Xiaodan Liang, Gang Wang
To this end, we propose a novel one-to-all spatial matching knowledge distillation approach.
1 code implementation • ICLR 2022 • Yash Mehta, Colin White, Arber Zela, Arjun Krishnakumar, Guri Zabergja, Shakiba Moradian, Mahmoud Safari, Kaicheng Yu, Frank Hutter
The release of tabular benchmarks, such as NAS-Bench-101 and NAS-Bench-201, has significantly lowered the computational overhead for conducting scientific research in neural architecture search (NAS).
no code implementations • 4 Oct 2021 • Kaicheng Yu, René Ranftl, Mathieu Salzmann
Weight sharing promises to make neural architecture search (NAS) tractable even on commodity hardware.
1 code implementation • CVPR 2021 • Kaicheng Yu, Rene Ranftl, Mathieu Salzmann
Weight sharing has become a de facto standard in neural architecture search because it enables the search to be done on commodity hardware.
no code implementations • ICCV 2021 • Xiaobin Hu, Wenqi Ren, Kaicheng Yu, Kaihao Zhang, Xiaochun Cao, Wei Liu, Bjoern Menze
Multi-scale and multi-patch deep models have been shown to be effective in removing blur from dynamic scenes.
no code implementations • 9 Mar 2020 • Kaicheng Yu, Rene Ranftl, Mathieu Salzmann
Weight sharing promises to make neural architecture search (NAS) tractable even on commodity hardware.
no code implementations • ICCV 2019 • Wei Wang, Kaicheng Yu, Joachim Hugonot, Pascal Fua, Mathieu Salzmann
State-of-the-art segmentation methods rely on very deep networks that are not always easy to train without very large training datasets and tend to be relatively slow to run on standard GPUs.
no code implementations • ICLR 2019 • Yassine Benyahia, Kaicheng Yu, Kamil Bennani-Smires, Martin Jaggi, Anthony Davison, Mathieu Salzmann, Claudiu Musat
We identify a phenomenon, which we refer to as multi-model forgetting, that occurs when sequentially training multiple deep networks with partially-shared parameters; the performance of previously-trained models degrades as one optimizes a subsequent one, due to the overwriting of shared parameters.
1 code implementation • ICLR 2020 • Kaicheng Yu, Christian Sciuto, Martin Jaggi, Claudiu Musat, Mathieu Salzmann
Neural Architecture Search (NAS) aims to facilitate the design of deep networks for new tasks.
no code implementations • 27 Nov 2018 • Wei Wang, Kaicheng Yu, Joachim Hugonot, Pascal Fua, Mathieu Salzmann
As evidenced by our results on standard hand segmentation benchmarks and on our own dataset, our approach outperforms these other, simpler recurrent segmentation techniques, as well as the state-of-the-art hand segmentation method.
1 code implementation • ECCV 2018 • Kaicheng Yu, Mathieu Salzmann
We then propose to make use of a square-root normalization, which makes the distribution of the resulting representation converge to a Gaussian, with which most classifiers of recent first-order networks comply.
1 code implementation • 23 Jan 2018 • Kaicheng Yu, Mathieu Salzmann
Our approach is motivated by a statistical analysis of the network's activations, relying on operations that lead to a Gaussian-distributed final representation, as inherently used by first-order deep networks.
no code implementations • 20 Mar 2017 • Kaicheng Yu, Mathieu Salzmann
By performing linear combinations and element-wise nonlinear operations, these networks can be thought of as extracting solely first-order information from an input image.