Search Results for author: Hengcan Shi

Found 14 papers, 4 papers with code

DifFUSER: Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation

no code implementations • 6 Apr 2024 • Duy-Tho Le, Hengcan Shi, Jianfei Cai, Hamid Rezatofighi

Diffusion models have recently gained prominence as powerful deep generative models, demonstrating unmatched performance across various domains.

3D Object Detection Denoising +2

JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments

no code implementations • 2 Apr 2024 • Duy-Tho Le, Chenhui Gou, Stavya Datta, Hengcan Shi, Ian Reid, Jianfei Cai, Hamid Rezatofighi

JRDB-PanoTrack includes (1) various data involving indoor and outdoor crowded scenes, as well as comprehensive 2D and 3D synchronized data modalities; (2) high-quality 2D spatial panoptic segmentation and temporal tracking annotations, with additional 3D label projections for further spatial understanding; (3) diverse object classes for closed- and open-world recognition benchmarks, with OSPA-based metrics for evaluation.

Decision Making Panoptic Segmentation +1

Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss

no code implementations • 12 Mar 2024 • Xuhua Ren, Hengcan Shi, Jin Li

We propose a novel open-vocabulary text recognition framework, Pseudo-OCR, to recognize out-of-vocabulary (OOV) words.

Image Inpainting Optical Character Recognition (OCR) +2

Unified Open-Vocabulary Dense Visual Prediction

no code implementations • 17 Jul 2023 • Hengcan Shi, Munawar Hayat, Jianfei Cai

We present a UOVN training mechanism to reduce such gaps.

Object Detection

CoactSeg: Learning from Heterogeneous Data for New Multiple Sclerosis Lesion Segmentation

1 code implementation • 10 Jul 2023 • Yicheng Wu, Zhonghua Wu, Hengcan Shi, Bjoern Picker, Winston Chong, Jianfei Cai

A simple and effective relation regularization is proposed to enforce the longitudinal relations among the three outputs and thereby improve model learning.

Lesion Segmentation Segmentation

Open-Vocabulary Object Detection via Scene Graph Discovery

no code implementations • 7 Jul 2023 • Hengcan Shi, Munawar Hayat, Jianfei Cai

Previous works use only pairs of nouns and individual objects in vision-language (VL) data, although these data usually contain much more information, such as scene graphs, which is also crucial for open-vocabulary (OV) detection.

Graph Generation Object +5

Class Enhancement Losses with Pseudo Labels for Zero-shot Semantic Segmentation

no code implementations • 18 Jan 2023 • Son Duy Dao, Hengcan Shi, Dinh Phung, Jianfei Cai

Recent mask proposal models have significantly improved the performance of zero-shot semantic segmentation.

Language Modelling Open Vocabulary Semantic Segmentation +4

Transformer Scale Gate for Semantic Segmentation

no code implementations • CVPR 2023 • Hengcan Shi, Munawar Hayat, Jianfei Cai

Effectively encoding multi-scale contextual information is crucial for accurate semantic segmentation.

feature selection Segmentation +1

ProposalCLIP: Unsupervised Open-Category Object Proposal Generation via Exploiting CLIP Cues

no code implementations • CVPR 2022 • Hengcan Shi, Munawar Hayat, Yicheng Wu, Jianfei Cai

We first analyze CLIP for unsupervised open-category proposal generation and design an objectness score based on our empirical analysis of proposal selection.

Object object-detection +2

Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching

no code implementations • 18 Jan 2022 • Hengcan Shi, Munawar Hayat, Jianfei Cai

To avoid the laborious annotation required by conventional referring grounding, unpaired referring grounding is introduced, where the training data contain only a set of images and queries without correspondences between them.

Image-text matching Referring Expression +1

Accurate and Real-time 3D Pedestrian Detection Using an Efficient Attentive Pillar Network

1 code implementation • 31 Dec 2021 • Duy-Tho Le, Hengcan Shi, Hamid Rezatofighi, Jianfei Cai

Efficiently and accurately detecting people from 3D point cloud data is of great importance in many robotic and autonomous driving applications.

Ranked #1 on Birds Eye View Object Detection on KITTI Pedestrian Hard

3D Object Detection Autonomous Driving +3

Deep Music Retrieval for Fine-Grained Videos by Exploiting Cross-Modal-Encoded Voice-Overs

1 code implementation • 21 Apr 2021 • Tingtian Li, Zixun Sun, Haoruo Zhang, Jin Li, Ziming Wu, Hui Zhan, Yipeng Yu, Hengcan Shi

We investigate the voice-overs widely added to short videos and propose a novel framework to retrieve background music (BGM) for fine-grained short videos.

Pseudo Label Retrieval

Scene Parsing via Integrated Classification Model and Variance-Based Regularization

1 code implementation • CVPR 2019 • Hengcan Shi, Hongliang Li, Qingbo Wu, Zichen Song

The integrated classification model contains multiple classifiers: a general classifier and a refinement classifier that distinguishes confusing categories.

Ranked #1 on Scene Segmentation on SUN-RGBD

Classification General Classification +2

Key-Word-Aware Network for Referring Expression Image Segmentation

no code implementations • ECCV 2018 • Hengcan Shi, Hongliang Li, Fanman Meng, Qingbo Wu

The relationships between different image regions are also not considered, even though they are important for eliminating undesired foreground objects in accordance with the specific query.

Image Segmentation Object +2
