Search Results for author: Wenwei Zhang

Found 47 papers, 39 papers with code

Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding

1 code implementation 25 Mar 2024 Lingdong Kong, Xiang Xu, Jun Cen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu

Safety-critical 3D scene understanding tasks necessitate not only accurate but also confident predictions from 3D perception models.

Data Augmentation Scene Understanding

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models

1 code implementation 19 Mar 2024 Zehui Chen, Kuikun Liu, Qiuchen Wang, Wenwei Zhang, Jiangning Liu, Dahua Lin, Kai Chen, Feng Zhao

Open-sourced Large Language Models (LLMs) have achieved great success in various NLP tasks; however, they are still far inferior to API-based models when acting as agents.

Hallucination

CriticBench: Evaluating Large Language Models as Critic

1 code implementation 21 Feb 2024 Tian Lan, Wenwei Zhang, Chen Xu, Heyan Huang, Dahua Lin, Kai Chen, Xian-Ling Mao

Critique ability is crucial in the scalable oversight and self-improvement of Large Language Models (LLMs).

Code Needs Comments: Enhancing Code LLMs with Comment Augmentation

no code implementations 20 Feb 2024 Demin Song, Honglin Guo, Yunhua Zhou, Shuhao Xing, Yudong Wang, Zifan Song, Wenwei Zhang, Qipeng Guo, Hang Yan, Xipeng Qiu, Dahua Lin

Programming skill is a crucial ability for Large Language Models (LLMs), necessitating a deep understanding of programming languages (PLs) and their correlation with natural languages (NLs).

Data Augmentation

InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning

1 code implementation 9 Feb 2024 Huaiyuan Ying, Shuo Zhang, Linyang Li, Zhejian Zhou, Yunfan Shao, Zhaoye Fei, Yichuan Ma, Jiawei Hong, Kuikun Liu, Ziyi Wang, Yudong Wang, Zijian Wu, Shuaibin Li, Fengzhe Zhou, Hongwei Liu, Songyang Zhang, Wenwei Zhang, Hang Yan, Xipeng Qiu, Jiayu Wang, Kai Chen, Dahua Lin

We further explore how to use LEAN to solve math problems and study its performance under the setting of multi-task learning which shows the possibility of using LEAN as a unified platform for solving and proving in math.

Data Augmentation GSM8K +3

Can AI Assistants Know What They Don't Know?

1 code implementation 24 Jan 2024 Qinyuan Cheng, Tianxiang Sun, Xiangyang Liu, Wenwei Zhang, Zhangyue Yin, ShiMin Li, Linyang Li, Zhengfu He, Kai Chen, Xipeng Qiu

To answer this question, we construct a model-specific "I don't know" (Idk) dataset for an assistant, which contains its known and unknown questions, based on existing open-domain question answering datasets.

Math Open-Domain Question Answering +1

OMG-Seg: Is One Model Good Enough For All Segmentation?

1 code implementation 18 Jan 2024 Xiangtai Li, Haobo Yuan, Wei Li, Henghui Ding, Size Wu, Wenwei Zhang, Yining Li, Kai Chen, Chen Change Loy

In this work, we address various segmentation tasks, each traditionally tackled by distinct or partially unified models.

Interactive Segmentation Panoptic Segmentation +3

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

1 code implementation 26 Dec 2023 Tai Wang, Xiaohan Mao, Chenming Zhu, Runsen Xu, Ruiyuan Lyu, Peisen Li, Xiao Chen, Wenwei Zhang, Kai Chen, Tianfan Xue, Xihui Liu, Cewu Lu, Dahua Lin, Jiangmiao Pang

In the realm of computer vision and robotics, embodied agents are expected to explore their environment and carry out human instructions.

Scene Understanding

CLIM: Contrastive Language-Image Mosaic for Region Representation

1 code implementation 18 Dec 2023 Size Wu, Wenwei Zhang, Lumin Xu, Sheng Jin, Wentao Liu, Chen Change Loy

Our experimental results demonstrate that CLIM improves different baseline open-vocabulary object detectors by a large margin on both OV-COCO and OV-LVIS benchmarks.

Object object-detection +1

Mixed Pseudo Labels for Semi-Supervised Object Detection

1 code implementation 12 Dec 2023 Zeming Chen, Wenwei Zhang, Xinjiang Wang, Kai Chen, Zhi Wang

While the pseudo-label method has demonstrated considerable success in semi-supervised object detection tasks, this paper uncovers notable limitations within this approach.

Ranked #1 on Semi-Supervised Object Detection on COCO 100% labeled data (using extra training data)

Object object-detection +3

Fake Alignment: Are LLMs Really Aligned Well?

no code implementations 10 Nov 2023 Yixu Wang, Yan Teng, Kexin Huang, Chengqi Lyu, Songyang Zhang, Wenwei Zhang, Xingjun Ma, Yu-Gang Jiang, Yu Qiao, Yingchun Wang

To address this, we introduce the Fake alIgNment Evaluation (FINE) framework and two novel metrics, Consistency Score (CS) and Consistent Safety Score (CSS), which jointly assess two complementary forms of evaluation to quantify fake alignment and obtain corrected performance estimates.

Multiple-choice

OV-PARTS: Towards Open-Vocabulary Part Segmentation

1 code implementation NeurIPS 2023 Meng Wei, Xiaoyu Yue, Wenwei Zhang, Shu Kong, Xihui Liu, Jiangmiao Pang

Secondly, part segmentation introduces an open granularity challenge due to the diverse and often ambiguous definitions of parts in the open world.

Open Vocabulary Semantic Segmentation Segmentation +1

CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

1 code implementation 2 Oct 2023 Size Wu, Wenwei Zhang, Lumin Xu, Sheng Jin, Xiangtai Li, Wentao Liu, Chen Change Loy

However, when transferring the vision-language alignment of CLIP from global image representation to local region representation for the open-vocabulary dense prediction tasks, CLIP ViTs suffer from the domain shift from full images to local image regions.

Image Classification Image Segmentation +7

DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection

1 code implementation 2 Oct 2023 Shilin Xu, Xiangtai Li, Size Wu, Wenwei Zhang, Yining Li, Guangliang Cheng, Yunhai Tong, Kai Chen, Chen Change Loy

This work presents a simple yet effective strategy that leverages the zero-shot classification ability of pre-trained vision-language models (VLM), such as CLIP, to directly discover proposals of possible novel classes.

Novel Object Detection Object +5

Object2Scene: Putting Objects in Context for Open-Vocabulary 3D Detection

no code implementations 18 Sep 2023 Chenming Zhu, Wenwei Zhang, Tai Wang, Xihui Liu, Kai Chen

Instead of leveraging 2D images, we propose Object2Scene, the first approach that leverages large-scale large-vocabulary 3D object datasets to augment existing 3D scene datasets for open-vocabulary 3D object detection.

3D Object Detection 3D Open-Vocabulary Object Detection +4

Unified Human-Scene Interaction via Prompted Chain-of-Contacts

1 code implementation 14 Sep 2023 Zeqi Xiao, Tai Wang, Jingbo Wang, Jinkun Cao, Wenwei Zhang, Bo Dai, Dahua Lin, Jiangmiao Pang

Based on the definition, UniHSI comprises a Large Language Model (LLM) Planner that translates language prompts into task plans in the form of CoC, and a Unified Controller that turns CoC into uniform task execution.

Language Modelling Large Language Model

MultiModal-GPT: A Vision and Language Model for Dialogue with Humans

1 code implementation 8 May 2023 Tao Gong, Chengqi Lyu, Shilong Zhang, Yudong Wang, Miao Zheng, Qian Zhao, Kuikun Liu, Wenwei Zhang, Ping Luo, Kai Chen

To further enhance MultiModal-GPT's ability to chat with humans, we utilize language-only instruction-following data to train MultiModal-GPT jointly.

Instruction Following Language Modelling

Transformer-Based Visual Segmentation: A Survey

2 code implementations 19 Apr 2023 Xiangtai Li, Henghui Ding, Haobo Yuan, Wenwei Zhang, Jiangmiao Pang, Guangliang Cheng, Kai Chen, Ziwei Liu, Chen Change Loy

Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks.

Autonomous Driving Point Cloud Segmentation +1

RoboBEV: Towards Robust Bird's Eye View Perception under Corruptions

1 code implementation 13 Apr 2023 Shaoyuan Xie, Lingdong Kong, Wenwei Zhang, Jiawei Ren, Liang Pan, Kai Chen, Ziwei Liu

Our experiments further demonstrate that pre-training and depth-free BEV transformation have the potential to enhance out-of-distribution robustness.

Robust Camera Only 3D Object Detection

MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training

1 code implementation CVPR 2023 Runsen Xu, Tai Wang, Wenwei Zhang, Runjian Chen, Jinkun Cao, Jiangmiao Pang, Dahua Lin

This paper introduces the Masked Voxel Jigsaw and Reconstruction (MV-JAR) method for LiDAR-based self-supervised pre-training and a carefully designed data-efficient 3D object detection benchmark on the Waymo dataset.

3D Object Detection object-detection

Aligning Bag of Regions for Open-Vocabulary Object Detection

1 code implementation CVPR 2023 Size Wu, Wenwei Zhang, Sheng Jin, Wentao Liu, Chen Change Loy

The embeddings of regions in a bag are treated as embeddings of words in a sentence, and they are sent to the text encoder of a VLM to obtain the bag-of-regions embedding, which is learned to be aligned to the corresponding features extracted by a frozen VLM.

Ranked #7 on Open Vocabulary Object Detection on MSCOCO (using extra training data)

Object object-detection +2

RTMDet: An Empirical Study of Designing Real-Time Object Detectors

9 code implementations 14 Dec 2022 Chengqi Lyu, Wenwei Zhang, Haian Huang, Yue Zhou, Yudong Wang, Yanyi Liu, Shilong Zhang, Kai Chen

In this paper, we aim to design an efficient real-time object detector that exceeds the YOLO series and is easily extensible for many object recognition tasks such as instance segmentation and rotated object detection.

Object object-detection +7

MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection with Pretrained Monocular Backbones

1 code implementation 26 Jul 2022 Tai Wang, Qing Lian, Chenming Zhu, Xinge Zhu, Wenwei Zhang

In this technical report, we present our solution, dubbed MV-FCOS3D++, for the Camera-Only 3D Detection track in Waymo Open Dataset Challenge 2022.

object-detection Object Detection +1

MMRotate: A Rotated Object Detection Benchmark using PyTorch

1 code implementation 28 Apr 2022 Yue Zhou, Xue Yang, Gefan Zhang, Jiabao Wang, Yanyi Liu, Liping Hou, Xue Jiang, Xingzhao Liu, Junchi Yan, Chengqi Lyu, Wenwei Zhang, Kai Chen

We present an open-source toolbox, named MMRotate, which provides a coherent algorithm framework of training, inference, and evaluation for popular rotated object detection algorithms based on deep learning.

Object object-detection +1

Dense Siamese Network for Dense Unsupervised Learning

1 code implementation 21 Mar 2022 Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy

It also extracts a batch of region embeddings that correspond to some sub-regions in the overlapped area to be contrasted for region consistency.

Self-Supervised Learning Unsupervised Semantic Segmentation

MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding

1 code implementation 14 Aug 2021 Zhanghui Kuang, Hongbin Sun, Zhizhong Li, Xiaoyu Yue, Tsui Hin Lin, Jianyong Chen, Huaqiang Wei, Yiqin Zhu, Tong Gao, Wenwei Zhang, Kai Chen, Wayne Zhang, Dahua Lin

We present MMOCR, an open-source toolbox which provides a comprehensive pipeline for text detection and recognition, as well as their downstream tasks such as named entity recognition and key information extraction.

Key Information Extraction named-entity-recognition +4

Integrated Satellite-HAP-Terrestrial Networks for Dual-Band Connectivity

no code implementations 6 Jul 2021 Wenwei Zhang, Ruoqi Deng, Boya Di, Lingyang Song

The recent development of high-altitude platforms (HAPs) has attracted increasing attention since they can serve as a promising communication method to assist satellite-terrestrial networks.

K-Net: Towards Unified Image Segmentation

1 code implementation NeurIPS 2021 Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy

The framework, named K-Net, segments both instances and semantic categories consistently by a group of learnable kernels, where each kernel is responsible for generating a mask for either a potential instance or a stuff class.

Image Segmentation Instance Segmentation +2

Exploring Data Augmentation for Multi-Modality 3D Object Detection

8 code implementations 23 Dec 2020 Wenwei Zhang, Zhe Wang, Chen Change Loy

Because multi-modality data augmentation must maintain consistency between point cloud and images, recent methods in this field typically use relatively insufficient data augmentation.

3D Object Detection Autonomous Driving +3

More Information Supervised Probabilistic Deep Face Embedding Learning

no code implementations ICML 2020 Ying Huang, Shangfeng Qiu, Wenwei Zhang, Xianghui Luo, Jinzhuo Wang

Research using margin-based comparison loss demonstrates the effectiveness of penalizing the distance between face features and their corresponding class centers.

Face Recognition Open Set Learning

Deep Frequent Spatial Temporal Learning for Face Anti-Spoofing

no code implementations 20 Jan 2020 Ying Huang, Wenwei Zhang, Jinzhuo Wang

Face anti-spoofing is crucial for the security of face recognition systems, as it prevents them from being compromised by presentation attacks.

Face Anti-Spoofing Face Recognition

EcoNAS: Finding Proxies for Economical Neural Architecture Search

no code implementations CVPR 2020 Dongzhan Zhou, Xinchi Zhou, Wenwei Zhang, Chen Change Loy, Shuai Yi, Xuesen Zhang, Wanli Ouyang

While many methods have been proposed to improve the efficiency of NAS, the search process is still laborious because training and evaluating plausible architectures over a large search space is time-consuming.

Neural Architecture Search

Side-Aware Boundary Localization for More Precise Object Detection

3 code implementations ECCV 2020 Jiaqi Wang, Wenwei Zhang, Yuhang Cao, Kai Chen, Jiangmiao Pang, Tao Gong, Jianping Shi, Chen Change Loy, Dahua Lin

To tackle the difficulty of precise localization in the presence of displacements with large variance, we further propose a two-step localization scheme, which first predicts a range of movement through bucket prediction and then pinpoints the precise position within the predicted bucket.

Object object-detection +2

Robust Multi-Modality Multi-Object Tracking

1 code implementation ICCV 2019 Wenwei Zhang, Hui Zhou, Shuyang Sun, Zhe Wang, Jianping Shi, Chen Change Loy

Multi-sensor perception is crucial to ensure the reliability and accuracy of autonomous driving systems, while multi-object tracking (MOT) improves that by tracing sequential movement of dynamic objects.

Autonomous Driving Multi-Object Tracking +2
