TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Cross-Part Crowd Counting	ShanghaiTech A	CrowdCLIP	MAE	217	# 2
Cross-Part Crowd Counting	ShanghaiTech A	CrowdCLIP	RMSE	322.7	# 2
Cross-Part Crowd Counting	ShanghaiTech A	MCNN	MAE	221.4	# 1
Cross-Part Crowd Counting	ShanghaiTech A	MCNN	RMSE	357.8	# 3
Cross-Part Crowd Counting	ShanghaiTech B	MCNN	MAE	85.2	# 1
Cross-Part Crowd Counting	ShanghaiTech B	MCNN	RMSE	142.3	# 3
Cross-Part Crowd Counting	ShanghaiTech B	CrowdCLIP	MAE	69.6	# 2
Cross-Part Crowd Counting	ShanghaiTech B	CrowdCLIP	RMSE	80.7	# 2
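For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how MAE and RMSE are commonly computed over per-image crowd counts, where lower is better for both. The function names and the example counts are illustrative assumptions, not values or code from the paper.

```python
import math

def mae(pred_counts, true_counts):
    """Mean absolute error between predicted and ground-truth per-image counts."""
    return sum(abs(p - t) for p, t in zip(pred_counts, true_counts)) / len(true_counts)

def rmse(pred_counts, true_counts):
    """Root mean squared error; penalises large per-image errors more than MAE."""
    squared = sum((p - t) ** 2 for p, t in zip(pred_counts, true_counts))
    return math.sqrt(squared / len(true_counts))

# Made-up counts for three images (illustration only).
predicted = [100, 250, 40]
ground_truth = [110, 230, 45]

print(f"MAE  = {mae(predicted, ground_truth):.2f}")
print(f"RMSE = {rmse(predicted, ground_truth):.2f}")
```

Because RMSE squares each error before averaging, a model with a few large misses can have a much higher RMSE than MAE, which is why both are reported.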


CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model

CVPR 2023 · Dingkang Liang, Jiahao Xie, Zhikang Zou, Xiaoqing Ye, Wei Xu, Xiang Bai

Supervised crowd counting relies heavily on manual labeling, which is difficult and expensive, especially in dense scenes. To alleviate the problem, we propose a novel unsupervised framework for crowd counting, named CrowdCLIP. The core idea is built on two observations: 1) the recent contrastive pre-trained vision-language model (CLIP) has shown impressive performance on various downstream tasks; 2) there is a natural mapping between crowd patches and count text. To the best of our knowledge, CrowdCLIP is the first to investigate vision-language knowledge to solve the counting problem. Specifically, in the training stage, we exploit a multi-modal ranking loss by constructing ranking text prompts to match the size-sorted crowd patches, which guides the learning of the image encoder. In the testing stage, to deal with the diversity of image patches, we propose a simple yet effective progressive filtering strategy that first selects the patches most likely to contain crowds and then maps them into the language space with various counting intervals. Extensive experiments on five challenging datasets demonstrate that the proposed CrowdCLIP achieves superior performance compared to previous unsupervised state-of-the-art counting methods. Notably, CrowdCLIP even surpasses some popular fully-supervised methods under the cross-dataset setting. The source code will be available at https://github.com/dk-liang/CrowdCLIP.
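The idea of matching size-sorted crowd patches to ranked count prompts can be illustrated with a small sketch. This is not the authors' implementation: it assumes patch and prompt embeddings have already been produced by some image and text encoder, uses a plain cross-entropy over cosine similarities as a stand-in for the multi-modal ranking loss, and omits the progressive filtering step. The names `ranking_match_loss`, `predict_count`, `temperature`, and `interval_counts` are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b):
    """Pairwise cosine similarity between rows of a (n, d) and rows of b (m, d)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def ranking_match_loss(patch_emb, prompt_emb, temperature=0.07):
    """Assumed stand-in loss: pair the i-th size-sorted patch with the i-th ranked prompt.

    patch_emb:  (n, d) embeddings of patches ordered from smallest to largest.
    prompt_emb: (n, d) embeddings of count prompts ordered from fewest to most people.
    """
    logits = cosine_similarity(patch_emb, prompt_emb) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The matching prompt for patch i sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))

def predict_count(patch_emb, prompt_emb, interval_counts):
    """Test-time mapping: assign each patch the count of its most similar prompt."""
    sims = cosine_similarity(patch_emb, prompt_emb)
    return np.asarray(interval_counts)[sims.argmax(axis=1)]

# Toy embeddings (illustration only): four orthogonal directions.
patches = np.eye(4)
prompts = np.eye(4)
print(ranking_match_loss(patches, prompts))        # small: order agrees
print(ranking_match_loss(patches, prompts[::-1]))  # large: order reversed
print(predict_count(patches, prompts, [10, 20, 30, 40]))
```

In this toy setup the loss is small when the patch order agrees with the prompt order and large when the prompts are reversed, which is the behaviour a ranking-style objective is meant to encourage in the image encoder.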


Code

dk-liang/crowdclip official

raywang335/l2rclip

Tasks

Cross-Part Crowd Counting

Crowd Counting

Language Modelling

Datasets

ShanghaiTech

UCF-QNRF

JHU-CROWD++

Results from the Paper

Ranked #1 on Cross-Part Crowd Counting on ShanghaiTech B

