Search Results for author: Yuenan Hou

Found 27 papers, 18 papers with code

NeRF-Det++: Incorporating Semantic Cues and Perspective-aware Depth Supervision for Indoor Multi-View 3D Detection

1 code implementation • 22 Feb 2024 • Chenxi Huang, Yuenan Hou, Weicai Ye, Di Huang, Xiaoshui Huang, Binbin Lin, Deng Cai, Wanli Ouyang

We project the freely available 3D segmentation annotations onto the 2D plane and leverage the corresponding 2D semantic maps as the supervision signal, significantly enhancing the semantic awareness of multi-view detectors.

Depth Estimation, Depth Prediction, +1 more

A Comprehensive Survey on 3D Content Generation

1 code implementation • 2 Feb 2024 • Jian Liu, Xiaoshui Huang, Tianyu Huang, Lu Chen, Yuenan Hou, Shixiang Tang, Ziwei Liu, Wanli Ouyang, WangMeng Zuo, Junjun Jiang, Xianming Liu

Recent years have witnessed remarkable advances in artificial intelligence generated content (AIGC), with diverse input modalities, e.g., text, image, video, audio and 3D.

Predicting Gradient is Better: Exploring Self-Supervised Learning for SAR ATR with a Joint-Embedding Predictive Architecture

2 code implementations • 26 Nov 2023 • Weijie Li, Yang Wei, Tianpeng Liu, Yuenan Hou, YuXuan Li, Zhen Liu, Yongxiang Liu, Li Liu

Besides, we employ local masks and multi-scale features to accommodate the various small targets in remote sensing.

Representation Learning, Self-Supervised Learning

Point Cloud Pre-training with Diffusion Models

no code implementations • 25 Nov 2023 • Xiao Zheng, Xiaoshui Huang, Guofeng Mei, Yuenan Hou, Zhaoyang Lyu, Bo Dai, Wanli Ouyang, Yongshun Gong

This generator aggregates the features extracted by the backbone and employs them as the condition to guide the point-to-point recovery from the noisy point cloud, thereby assisting the backbone in capturing both local and global geometric priors as well as the global point density distribution of the object.

Point Cloud Pre-training

UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase

1 code implementation • ICCV 2023 • Youquan Liu, Runnan Chen, Xin Li, Lingdong Kong, Yuchen Yang, Zhaoyang Xia, Yeqi Bai, Xinge Zhu, Yuexin Ma, Yikang Li, Yu Qiao, Yuenan Hou

Besides, we construct the OpenPCSeg codebase, which is the largest and most comprehensive outdoor LiDAR segmentation codebase.

Ranked #2 on 3D Semantic Segmentation on SemanticKITTI (using extra training data)

3D Semantic Segmentation, LIDAR Semantic Segmentation, +2 more

Human-centric Scene Understanding for 3D Large-scale Scenarios

1 code implementation • ICCV 2023 • Yiteng Xu, Peishan Cong, Yichen Yao, Runnan Chen, Yuenan Hou, Xinge Zhu, Xuming He, Jingyi Yu, Yuexin Ma

Human-centric scene understanding is significant for real-world applications, but it is extremely challenging due to the existence of diverse human poses and actions, complex human-environment interactions, severe occlusions in crowds, etc.

Action Recognition, Scene Understanding, +1 more

See More and Know More: Zero-shot Point Cloud Segmentation via Multi-modal Visual Data

no code implementations • ICCV 2023 • Yuhang Lu, Qi Jiang, Runnan Chen, Yuenan Hou, Xinge Zhu, Yuexin Ma

They typically align visual features with semantic features obtained from word embedding by the supervision of seen classes' annotations.

Point Cloud Segmentation, Zero-Shot Learning

Clothes-Invariant Feature Learning by Causal Intervention for Clothes-Changing Person Re-identification

no code implementations • 10 May 2023 • Xulin Li, Yan Lu, Bin Liu, Yuenan Hou, Yating Liu, Qi Chu, Wanli Ouyang, Nenghai Yu

Clothes-invariant feature extraction is critical to the clothes-changing person re-identification (CC-ReID).

Clothes Changing Person Re-Identification

WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language

no code implementations • 12 Apr 2023 • Zhenxiang Lin, Xidong Peng, Peishan Cong, Yuenan Hou, Xinge Zhu, Sibei Yang, Yuexin Ma

We introduce the task of 3D visual grounding in large-scale dynamic scenes based on natural linguistic descriptions and online captured multi-modal visual data, including 2D images and 3D LiDAR point clouds.

Autonomous Driving, Object Localization, +1 more

SCPNet: Semantic Scene Completion on Point Cloud

1 code implementation • CVPR 2023 • Zhaoyang Xia, Youquan Liu, Xin Li, Xinge Zhu, Yuexin Ma, Yikang Li, Yuenan Hou, Yu Qiao

We propose a simple yet effective label rectification strategy, which uses off-the-shelf panoptic segmentation labels to remove the traces of dynamic objects in completion labels, greatly improving the performance of deep models especially for those moving objects.

Ranked #1 on 3D Semantic Scene Completion on SemanticKITTI

3D Semantic Scene Completion, Knowledge Distillation, +3 more

Rethinking Range View Representation for LiDAR Segmentation

no code implementations • ICCV 2023 • Lingdong Kong, Youquan Liu, Runnan Chen, Yuexin Ma, Xinge Zhu, Yikang Li, Yuenan Hou, Yu Qiao, Ziwei Liu

We show that, for the first time, a range view method is able to surpass the point, voxel, and multi-view fusion counterparts in the competing LiDAR semantic and panoptic segmentation benchmarks, i.e., SemanticKITTI, nuScenes, and ScribbleKITTI.

Ranked #4 on 3D Semantic Segmentation on SemanticKITTI

3D Semantic Segmentation, Autonomous Driving, +4 more

LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion

1 code implementation • CVPR 2023 • Xin Li, Tao Ma, Yuenan Hou, Botian Shi, Yuchen Yang, Youquan Liu, Xingjiao Wu, Qin Chen, Yikang Li, Yu Qiao, Liang He

Notably, LoGoNet ranks 1st on Waymo 3D object detection leaderboard and obtains 81.02 mAPH (L2) detection performance.

3D Object Detection, object-detection, +1 more

CLIP2Scene: Towards Label-efficient 3D Scene Understanding by CLIP

1 code implementation • CVPR 2023 • Runnan Chen, Youquan Liu, Lingdong Kong, Xinge Zhu, Yuexin Ma, Yikang Li, Yuenan Hou, Yu Qiao, Wenping Wang

For the first time, our pre-trained network achieves annotation-free 3D semantic segmentation with 20.8% and 25.08% mIoU on nuScenes and ScanNet, respectively.

3D Semantic Segmentation, Contrastive Learning, +4 more

EPCL: Frozen CLIP Transformer is An Efficient Point Cloud Encoder

2 code implementations • 8 Dec 2022 • Xiaoshui Huang, Zhou Huang, Sheng Li, Wentao Qu, Tong He, Yuenan Hou, Yifan Zuo, Wanli Ouyang

These token embeddings are concatenated with a task token and fed into the frozen CLIP transformer to learn point cloud representation.

Few-Shot Learning, Segmentation, +1 more

Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection

no code implementations • 18 Oct 2022 • Xin Li, Botian Shi, Yuenan Hou, Xingjiao Wu, Tianlong Ma, Yikang Li, Liang He

To address these problems, we construct the homogeneous structure between the point cloud and images to avoid projective information loss by transforming the camera features into the LiDAR 3D space.

3D Object Detection, Autonomous Driving, +1 more

Mind the Gap in Distilling StyleGANs

1 code implementation • 18 Aug 2022 • Guodong Xu, Yuenan Hou, Ziwei Liu, Chen Change Loy

To further enhance the semantic consistency between the teacher and student model, we present a latent-direction-based distillation loss that preserves the semantic relations in latent space.

Knowledge Distillation

Vision-Centric BEV Perception: A Survey

1 code implementation • 4 Aug 2022 • Yuexin Ma, Tai Wang, Xuyang Bai, Huitong Yang, Yuenan Hou, Yaming Wang, Yu Qiao, Ruigang Yang, Dinesh Manocha, Xinge Zhu

In recent years, vision-centric Bird's Eye View (BEV) perception has garnered significant interest from both industry and academia due to its inherent advantages, such as providing an intuitive representation of the world and being conducive to data fusion.

Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation

no code implementations • CVPR 2022 • Yuenan Hou, Xinge Zhu, Yuexin Ma, Chen Change Loy, Yikang Li

This article addresses the problem of distilling knowledge from a large teacher model to a slim student network for LiDAR semantic segmentation.

Ranked #8 on LIDAR Semantic Segmentation on nuScenes (val mIoU metric)

3D Semantic Segmentation, Knowledge Distillation, +1 more

STCrowd: A Multimodal Dataset for Pedestrian Perception in Crowded Scenes

1 code implementation • CVPR 2022 • Peishan Cong, Xinge Zhu, Feng Qiao, Yiming Ren, Xidong Peng, Yuenan Hou, Lan Xu, Ruigang Yang, Dinesh Manocha, Yuexin Ma

In addition, considering the property of sparse global distribution and density-varying local distribution of pedestrians, we further propose a novel method, Density-aware Hierarchical heatmap Aggregation (DHA), to enhance pedestrian perception in crowded scenes.

Pedestrian Detection, Sensor Fusion

A Comprehensive Overhaul of Distilling Unconditional GANs

no code implementations • 29 Sep 2021 • Guodong Xu, Yuenan Hou, Ziwei Liu, Chen Change Loy

To further enhance the semantic consistency between the teacher and student model, we present another latent-direction-based distillation loss that preserves the semantic relations in latent space.

Knowledge Distillation

Categorical Relation-Preserving Contrastive Knowledge Distillation for Medical Image Classification

1 code implementation • 7 Jul 2021 • Xiaohan Xing, Yuenan Hou, Hang Li, Yixuan Yuan, Hongsheng Li, Max Q.-H. Meng

With the contribution of the CCD and CRP, our CRCKD algorithm can distill the relational knowledge more comprehensively.

Image Classification, Knowledge Distillation, +2 more

Network Pruning via Resource Reallocation

1 code implementation • 2 Mar 2021 • Yuenan Hou, Zheng Ma, Chunxiao Liu, Zhe Wang, Chen Change Loy

Channel pruning is broadly recognized as an effective approach to obtain a small compact model through eliminating unimportant channels from a large cumbersome network.

Network Pruning

Inter-Region Affinity Distillation for Road Marking Segmentation

1 code implementation • CVPR 2020 • Yuenan Hou, Zheng Ma, Chunxiao Liu, Tak-Wai Hui, Chen Change Loy

We study the problem of distilling knowledge from a large deep teacher network to a much smaller student network for the task of road marking segmentation.

Ranked #1 on Semantic Segmentation on ApolloScape

Knowledge Distillation, Lane Detection, +1 more

Learning Lightweight Lane Detection CNNs by Self Attention Distillation

2 code implementations • ICCV 2019 • Yuenan Hou, Zheng Ma, Chunxiao Liu, Chen Change Loy

Training deep models for lane detection is challenging due to the very subtle and sparse supervisory signals inherent in lane annotations.

Ranked #5 on Lane Detection on BDD100K val

Knowledge Distillation, Lane Detection, +1 more

Agnostic Lane Detection

no code implementations • 2 May 2019 • Yuenan Hou

Lane detection is an important yet challenging task in autonomous driving, which is affected by many factors, e.g., light conditions, occlusions caused by other vehicles, irrelevant markings on the road and the inherent long and thin property of lanes.

Ranked #18 on Lane Detection on TuSimple

Autonomous Driving, Instance Segmentation, +3 more

Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks

2 code implementations • 7 Nov 2018 • Yuenan Hou, Zheng Ma, Chunxiao Liu, Chen Change Loy

In this paper, we considerably improve the accuracy and robustness of predictions through heterogeneous auxiliary networks feature mimicking, a new and effective training method that provides us with much richer contextual signals apart from steering direction.

Ranked #1 on Steering Control on BDD100K val

Image Segmentation, Multi-Task Learning, +3 more

A novel DDPG method with prioritized experience replay

1 code implementation • IEEE International Conference on Systems, Man and Cybernetics (SMC) 2017 • Yuenan Hou, Lifeng Liu, Qing Wei, Xudong Xu, Chunlin Chen

Recently, a state-of-the-art algorithm, called deep deterministic policy gradient (DDPG), has achieved good performance in many continuous control tasks in the MuJoCo simulator.

Continuous Control, OpenAI Gym
