1 code implementation • 26 Mar 2024 • Yabin Zhang, Wenjie Zhu, Hui Tang, Zhiyuan Ma, Kaiyang Zhou, Lei Zhang
In this paper, we introduce a versatile adaptation approach that can effectively work under all three settings.
no code implementations • 7 Feb 2024 • Shuoyuan Wang, Jindong Wang, Guoqing Wang, Bob Zhang, Kaiyang Zhou, Hongxin Wei
Vision-language models (VLMs) have emerged as formidable tools, showing their strong capability in handling various open-vocabulary tasks in image recognition, text-driven visual content generation, and visual chatbots, to name a few.
3 code implementations • CVPR 2023 • Jingkang Yang, Wenxuan Peng, Xiangtai Li, Zujin Guo, Liangyu Chen, Bo Li, Zheng Ma, Kaiyang Zhou, Wayne Zhang, Chen Change Loy, Ziwei Liu
PVSG relates to the existing video scene graph generation (VidSGG) problem, which focuses on temporal interactions between humans and objects grounded with bounding boxes in videos.
1 code implementation • 12 Oct 2023 • Jingkang Yang, Yuhao Dong, Shuai Liu, Bo Li, Ziyue Wang, Chencheng Jiang, Haoran Tan, Jiamu Kang, Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu
Large vision-language models (VLMs) have achieved substantial progress in multimodal perception and reasoning.
1 code implementation • 15 Jun 2023 • Jingyang Zhang, Jingkang Yang, Pengyun Wang, Haoqi Wang, Yueqian Lin, Haoran Zhang, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Yixuan Li, Ziwei Liu, Yiran Chen, Hai Li
Out-of-Distribution (OOD) detection is critical for the reliable operation of open-world intelligent systems.
Out-of-Distribution Detection • Out of Distribution (OOD) Detection
1 code implementation • 29 May 2023 • Yuhang Zang, Wei Li, Jun Han, Kaiyang Zhou, Chen Change Loy
Moreover, we present ContextDET, a unified multimodal model that is capable of end-to-end differentiable modeling of visual-language contexts, so as to locate, identify, and associate visual objects with language inputs for human-AI interaction.
no code implementations • 24 May 2023 • Yuhang Zang, Kaiyang Zhou, Chen Huang, Chen Change Loy
This paper focuses on long-tailed object detection in the semi-supervised learning setting, which poses realistic challenges, but has rarely been studied in the literature.
1 code implementation • NeurIPS 2023 • Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu
To overcome the problem, we propose a prompt retrieval framework to automate the selection of in-context examples.
no code implementations • 25 Oct 2022 • Tingwei Wang, Da Li, Kaiyang Zhou, Tao Xiang, Yi-Zhe Song
Machine learning models are intrinsically vulnerable to domain shift between training and testing data, resulting in poor performance in novel domains.
3 code implementations • 13 Oct 2022 • Jingkang Yang, Pengyun Wang, Dejian Zou, Zitang Zhou, Kunyuan Ding, Wenxuan Peng, Haoqi Wang, Guangyao Chen, Bo Li, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Dan Hendrycks, Yixuan Li, Ziwei Liu
Out-of-distribution (OOD) detection is vital to safety-critical machine learning applications and has thus been extensively studied, with a plethora of methods developed in the literature.
1 code implementation • 13 Oct 2022 • Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
Prompt tuning, a parameter- and data-efficient transfer learning paradigm that tunes only a small number of parameters in a model's input space, has become a trend in the vision community since the emergence of large vision-language models like CLIP.
2 code implementations • 15 Sep 2022 • Kaiyang Zhou, Yuanhan Zhang, Yuhang Zang, Jingkang Yang, Chen Change Loy, Ziwei Liu
Another interesting observation is that the teacher-student gap on out-of-distribution data is bigger than that on in-distribution data, which highlights the capacity mismatch issue as well as the shortcoming of KD.
1 code implementation • 22 Jul 2022 • Jingkang Yang, Yi Zhe Ang, Zujin Guo, Kaiyang Zhou, Wayne Zhang, Ziwei Liu
Existing research addresses scene graph generation (SGG) -- a critical technology for scene understanding in images -- from a detection perspective, i.e., objects are detected using bounding boxes followed by prediction of their pairwise relationships.
Ranked #5 on Panoptic Scene Graph Generation on PSG Dataset
1 code implementation • 17 Jul 2022 • Kaiyang Zhou, Adeline Paiement, Majid Mirmehdi
We address the problem of people detection in RGB-D data, where we leverage depth information to develop a region-of-interest (ROI) selection method that provides proposals to two CNNs, one for color and one for depth.
1 code implementation • 9 Jun 2022 • Yuanhan Zhang, Kaiyang Zhou, Ziwei Liu
The size of vision models has grown exponentially over the last few years, especially after the emergence of Vision Transformer.
Ranked #1 on Image Classification on OmniBenchmark (using extra training data)
1 code implementation • 11 Apr 2022 • Jingkang Yang, Kaiyang Zhou, Ziwei Liu
In this paper, we take into account both shift types and introduce full-spectrum OOD (FS-OOD) detection, a more realistic problem setting that considers both detecting semantic shift and being tolerant to covariate shift; we also design three benchmarks.
Out-of-Distribution Detection • Out of Distribution (OOD) Detection
1 code implementation • 22 Mar 2022 • Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
To this end, we propose a novel open-vocabulary detector based on DETR -- hence the name OV-DETR -- which, once trained, can detect any object given its class name or an exemplar image.
Ranked #21 on Open Vocabulary Object Detection on MSCOCO
9 code implementations • CVPR 2022 • Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
With the rise of powerful pre-trained vision-language models like CLIP, it becomes essential to investigate ways to adapt these models to downstream datasets.
Ranked #3 on Prompt Engineering on ImageNet V2
1 code implementation • 9 Mar 2022 • Zhongying Deng, Kaiyang Zhou, Da Li, Junjun He, Yi-Zhe Song, Tao Xiang
In this paper, we address both single-source and multi-source UDA from a completely different perspective, which is to view each instance as a fine domain.
1 code implementation • 6 Nov 2021 • Zhongying Deng, Kaiyang Zhou, Yongxin Yang, Tao Xiang
Importantly, the attention module is supervised by a consistency loss, which is imposed on the distributions of channel attention weights between source and target domains.
3 code implementations • 21 Oct 2021 • Jingkang Yang, Kaiyang Zhou, Yixuan Li, Ziwei Liu
In this survey, we first present a unified framework called generalized OOD detection, which encompasses the five aforementioned problems, i.e., AD, ND, OSR, OOD detection, and OD.
13 code implementations • 2 Sep 2021 • Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
Large pre-trained vision-language models like CLIP have shown great potential in learning representations that are transferable across a wide range of downstream tasks.
Ranked #2 on Few-shot Age Estimation on MORPH Album2
no code implementations • ICCV 2021 • Yezhen Wang, Bo Li, Tong Che, Kaiyang Zhou, Ziwei Liu, Dongsheng Li
Confidence calibration is of great importance to the reliability of decisions made by machine learning systems.
2 code implementations • 5 Jul 2021 • Kaiyang Zhou, Yongxin Yang, Yu Qiao, Tao Xiang
MixStyle is easy to implement with a few lines of code, does not require modification to training objectives, and can fit a variety of learning paradigms including supervised domain generalization, semi-supervised domain generalization, and unsupervised domain adaptation.
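As a rough illustration of the "few lines of code" claim, the sketch below mixes per-instance feature statistics (channel-wise mean and standard deviation) between randomly paired samples in a batch. It is a minimal NumPy sketch, not the authors' released implementation: the function name `mixstyle`, the Beta parameter `alpha`, and the tensor layout `(batch, channels, height, width)` are assumptions made for this example.

```python
import numpy as np

def mixstyle(x, alpha=0.1, eps=1e-6, rng=None):
    """Mix feature statistics across instances (hypothetical helper).

    x: feature maps with assumed layout (batch, channels, height, width).
    """
    rng = np.random.default_rng() if rng is None else rng
    batch = x.shape[0]

    # Per-instance, per-channel statistics, treated here as the "style".
    mu = x.mean(axis=(2, 3), keepdims=True)
    sigma = np.sqrt(x.var(axis=(2, 3), keepdims=True) + eps)
    x_norm = (x - mu) / sigma

    # One mixing weight per instance, drawn from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha, size=(batch, 1, 1, 1))

    # Pair every instance with a randomly chosen instance from the batch.
    perm = rng.permutation(batch)
    mu_mix = lam * mu + (1.0 - lam) * mu[perm]
    sigma_mix = lam * sigma + (1.0 - lam) * sigma[perm]

    # Re-apply the mixed statistics to the normalized features.
    return x_norm * sigma_mix + mu_mix
```

Because the operation only rescales and shifts features, it leaves the training objective untouched, which is consistent with the description above. In practice such a layer would typically be active only during training.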
2 code implementations • 1 Jun 2021 • Kaiyang Zhou, Chen Change Loy, Ziwei Liu
We find that the DG methods, which by design are unable to handle unlabeled data, perform poorly with limited labels in SSDG; the SSL methods, especially FixMatch, obtain much better results but are still far away from the basic vanilla model trained using full labels.
3 code implementations • ICLR 2021 • Kaiyang Zhou, Yongxin Yang, Yu Qiao, Tao Xiang
Our method, termed MixStyle, is motivated by the observation that visual domain is closely related to image style (e.g., photo vs. sketch images).
Ranked #57 on Domain Generalization on PACS
2 code implementations • 3 Mar 2021 • Kaiyang Zhou, Ziwei Liu, Yu Qiao, Tao Xiang, Chen Change Loy
Generalization to out-of-distribution (OOD) data is a capability natural to humans yet challenging for machines to reproduce.
1 code implementation • ECCV 2020 • Kaiyang Zhou, Yongxin Yang, Timothy Hospedales, Tao Xiang
This explicitly increases the diversity of available training domains and leads to a more generalizable model.
Ranked #65 on Domain Generalization on PACS
1 code implementation • 16 Mar 2020 • Kaiyang Zhou, Yongxin Yang, Yu Qiao, Tao Xiang
Each such classifier is an expert to its own domain and a non-expert to others.
no code implementations • 12 Mar 2020 • Kaiyang Zhou, Yongxin Yang, Timothy Hospedales, Tao Xiang
This is achieved by having a learning objective formulated to ensure that the generated data can be correctly classified by the label classifier while fooling the domain classifier.
Ranked #63 on Domain Generalization on PACS
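The learning objective described in the entry above (generated data should be classified correctly by the label classifier while fooling the domain classifier) can be sketched as a single scalar to minimize. This is one plausible reading, not the paper's confirmed formulation: the function names, the subtraction form, and the `weight` parameter are assumptions introduced for illustration.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean cross-entropy over a batch of logits and integer targets."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

def generator_objective(label_logits, labels, domain_logits, domains, weight=1.0):
    """Hypothetical objective for the data generator (lower is better).

    label_logits:  label classifier outputs on the generated data
    domain_logits: domain classifier outputs on the generated data
    """
    # Keep the class label recognizable in the generated data.
    label_loss = cross_entropy(label_logits, labels)
    # Fool the domain classifier: its loss on the source domain should be high.
    domain_loss = cross_entropy(domain_logits, domains)
    return label_loss - weight * domain_loss
```

Under this reading, the objective decreases when the label classifier stays confident in the true class and when the domain classifier becomes less able to recognize the source domain.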
8 code implementations • 22 Oct 2019 • Kaiyang Zhou, Tao Xiang
Person re-identification (re-ID), which aims to re-identify people across different camera views, has been significantly advanced by deep learning in recent years, particularly with convolutional neural networks (CNNs).
8 code implementations • 15 Oct 2019 • Kaiyang Zhou, Yongxin Yang, Andrea Cavallaro, Tao Xiang
An effective person re-identification (re-ID) model should learn feature representations that are both discriminative, for distinguishing similar-looking people, and generalisable, for deployment across datasets without any adaptation.
Unsupervised Domain Adaptation • Unsupervised Person Re-Identification
16 code implementations • ICCV 2019 • Kaiyang Zhou, Yongxin Yang, Andrea Cavallaro, Tao Xiang
As an instance-level recognition problem, person re-identification (ReID) relies on discriminative features, which not only capture different spatial scales but also encapsulate an arbitrary combination of multiple scales.
Ranked #2 on Person Re-Identification on MSMT17-C
no code implementations • 9 Jul 2018 • Kaiyang Zhou, Tao Xiang, Andrea Cavallaro
Most existing video summarisation methods are based on either supervised or unsupervised learning.
6 code implementations • 29 Dec 2017 • Kaiyang Zhou, Yu Qiao, Tao Xiang
Video summarization aims to facilitate large-scale video browsing by producing short, concise summaries that are diverse and representative of original videos.
Ranked #7 on Unsupervised Video Summarization on TvSum