Stay on topic with Classifier-Free Guidance

Classifier-Free Guidance (CFG) has recently emerged in text-to-image generation as a lightweight technique to encourage prompt adherence in generations. In this work, we demonstrate that CFG can be used broadly as an inference-time technique in pure language modeling. We show that CFG (1) improves the performance of Pythia, GPT-2, and LLaMA-family models across an array of tasks: Q&A, reasoning, code generation, and machine translation, achieving SOTA on LAMBADA with LLaMA-7B over PaLM-540B; (2) brings improvements equivalent to a model with twice the parameter count; (3) can stack alongside other inference-time methods like Chain-of-Thought and Self-Consistency, yielding further improvements on difficult tasks; (4) can be used to increase the faithfulness and coherence of assistants in challenging form-driven and content-driven prompts: in a human evaluation, we show a 75% preference for GPT4All using CFG over the baseline.
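
In language modeling, CFG amounts to blending two next-token distributions at decode time: one conditioned on the prompt and one without it. The sketch below, written against the Hugging Face transformers API, shows greedy decoding with the standard guidance formula logits = uncond + gamma * (cond - uncond). The gpt2 checkpoint, the gamma value, and the cfg_generate helper are illustrative assumptions, not the paper's exact configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any causal LM works the same way; gpt2 is just a small example checkpoint.
model_name = "gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def cfg_generate(prompt, gamma=1.5, max_new_tokens=40):
    """Greedy decoding with classifier-free guidance.

    At each step the next-token logits are blended as
        logits = logits_uncond + gamma * (logits_cond - logits_uncond),
    where the conditional pass sees the prompt and the unconditional
    pass sees only the generated continuation.
    """
    cond_ids = tok(prompt, return_tensors="pt").input_ids
    # Unconditional context: no prompt, seeded with the BOS token so the
    # model always has at least one token of context.
    uncond_ids = tok(tok.bos_token, return_tensors="pt").input_ids

    for _ in range(max_new_tokens):
        with torch.no_grad():
            cond_logits = model(cond_ids).logits[:, -1, :]
            uncond_logits = model(uncond_ids).logits[:, -1, :]
        logits = uncond_logits + gamma * (cond_logits - uncond_logits)
        next_id = logits.argmax(dim=-1, keepdim=True)  # greedy; sampling also works
        # Append the chosen token to both contexts so they stay in sync.
        cond_ids = torch.cat([cond_ids, next_id], dim=-1)
        uncond_ids = torch.cat([uncond_ids, next_id], dim=-1)

    return tok.decode(cond_ids[0], skip_special_tokens=True)

print(cfg_generate("The capital of France is"))
```

With gamma = 1 this reduces to ordinary conditional decoding; values above 1 upweight tokens favored by the prompt, which is the effect behind the gains reported below.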

Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|------|---------|-------|--------|-------|-------------|
| Common Sense Reasoning | ARC (Easy) | LLaMA-7B + CFG (zero-shot) | Accuracy | 58.9 | #38 |
| Common Sense Reasoning | ARC (Easy) | LLaMA-13B + CFG (zero-shot) | Accuracy | 79.1 | #15 |
| Common Sense Reasoning | ARC (Easy) | LLaMA-30B + CFG (zero-shot) | Accuracy | 83.2 | #8 |
| Common Sense Reasoning | ARC (Easy) | LLaMA-65B + CFG (zero-shot) | Accuracy | 84.2 | #6 |
| Sentence Completion | HellaSwag | LLaMA-13B + CFG (zero-shot) | Accuracy | 82.1 | #35 |
| Sentence Completion | HellaSwag | LLaMA-30B + CFG (zero-shot) | Accuracy | 85.3 | #21 |
| Sentence Completion | HellaSwag | LLaMA-65B + CFG (zero-shot) | Accuracy | 86.3 | #16 |
| Language Modelling | LAMBADA | LLaMA-13B + CFG (zero-shot) | Accuracy | 82.2 | #8 |
| Language Modelling | LAMBADA | LLaMA-30B + CFG (zero-shot) | Accuracy | 83.9 | #5 |
| Language Modelling | LAMBADA | LLaMA-65B + CFG (zero-shot) | Accuracy | 84.0 | #4 |
| Text Generation | SciQ | LLaMA-13B + CFG (zero-shot) | Accuracy | 95.1 | #3 |
| Text Generation | SciQ | LLaMA-30B + CFG (zero-shot) | Accuracy | 96.4 | #2 |
| Text Generation | SciQ | LLaMA-65B + CFG (zero-shot) | Accuracy | 96.6 | #1 |
