ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval

18 May 2023  ·  Yue Yu, Yuchen Zhuang, Rongzhi Zhang, Yu Meng, Jiaming Shen, Chao Zhang

With the development of large language models (LLMs), zero-shot learning has attracted much attention for various NLP tasks. Unlike prior works that generate training data with billion-scale natural language generation (NLG) models, we propose a retrieval-enhanced framework to create training data from a general-domain unlabeled corpus. To realize this, we first conduct contrastive pretraining to learn an unsupervised dense retriever for extracting the most relevant documents using class-descriptive verbalizers. We then propose two simple strategies, Verbalizer Augmentation with Demonstrations and Self-consistency Guided Filtering, to improve the topic coverage of the generated dataset while removing noisy examples. Experiments on nine datasets demonstrate that ReGen achieves a 4.3% gain over the strongest baselines and saves around 70% of the time compared to baselines that use large NLG models. Moreover, ReGen can be naturally integrated with recently proposed large language models to further boost performance.
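The core idea in the abstract, retrieving pseudo-training data from an unlabeled corpus with class-descriptive verbalizers and then filtering noisy examples, can be illustrated with a short sketch. This is not the authors' implementation: the encoder checkpoint, the example verbalizers, and the simple agreement-based filter standing in for self-consistency guided filtering are all illustrative assumptions.

```python
# Minimal sketch of retrieval-based training-data generation (assumed setup,
# not the ReGen codebase). Requires: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

# Hypothetical class-descriptive verbalizers for AG News-style topics.
verbalizers = {
    "World":    "news about world politics and international affairs",
    "Sports":   "news about sports, games, and athletes",
    "Business": "news about business, finance, and the economy",
    "Sci/Tech": "news about science and technology",
}

# A tiny stand-in for the general-domain unlabeled corpus.
corpus = [
    "The central bank raised interest rates to curb inflation.",
    "The striker scored twice in the championship final.",
    "Researchers unveiled a faster algorithm for protein folding.",
]

# Any off-the-shelf dense encoder works for the sketch; ReGen instead
# pretrains its own retriever with contrastive learning.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = encoder.encode(corpus, normalize_embeddings=True)
class_names = list(verbalizers)
ver_emb = encoder.encode([verbalizers[c] for c in class_names],
                         normalize_embeddings=True)

# Step 1: retrieve the top-k most relevant documents per class verbalizer.
k = 2
scores = ver_emb @ doc_emb.T  # cosine similarity (embeddings are normalized)
pseudo_data = []
for ci, cname in enumerate(class_names):
    for di in np.argsort(-scores[ci])[:k]:
        pseudo_data.append((int(di), cname))

# Step 2 (crude stand-in for self-consistency guided filtering): keep a
# retrieved document only if its nearest verbalizer agrees with the class
# that retrieved it, discarding ambiguous examples.
filtered = [
    (corpus[di], cname) for di, cname in pseudo_data
    if class_names[int(np.argmax(scores[:, di]))] == cname
]
print(filtered)
```

In the paper the retriever is learned via contrastive pretraining and the filtering uses self-consistency rather than the single-nearest-verbalizer check above; the sketch only shows where the retrieved, pseudo-labeled examples come from.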


Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|------|---------|-------|-------------|--------------|-------------|
| Zero-Shot Text Classification | AG News | ReGen | Accuracy (%) | 85.0 | #1 |
| Zero-Shot Text Classification | AG News | Mining | Accuracy (%) | 79.2 | #2 |
