Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning. Given their computational cost, these models are difficult to replicate without significant capital. For the few that are available through APIs, no access is granted to the full model weights, making them difficult to study. We present Open Pre-trained Transformers (OPT), a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters, which we aim to fully and responsibly share with interested researchers. We show that OPT-175B is comparable to GPT-3, while requiring only 1/7th the carbon footprint to develop. We are also releasing our logbook detailing the infrastructure challenges we faced, along with code for experimenting with all of the released models.
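
The released checkpoints can be experimented with using standard open-source tooling. Below is a minimal sketch that loads one of the smaller OPT checkpoints through the Hugging Face transformers library and generates a short continuation; the checkpoint name facebook/opt-125m and the generation settings are illustrative assumptions, and the authors' own experimentation code (metaseq) is a separate codebase.

```python
# Minimal sketch: greedy generation with a small OPT checkpoint via the
# Hugging Face `transformers` library. The checkpoint name below is an
# assumption (larger variants such as opt-1.3b or opt-66b follow the same
# pattern, resources permitting).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"  # assumed published checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Large language models are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```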

Stereotypical Bias Analysis on CrowS-Pairs (bias score per category; # = global leaderboard rank)

Category                GPT-3         OPT-175B
Gender                  62.6 (#2)     65.7 (#3)
Religion                62.6 (#2)     65.7 (#3)
Race/Color              64.7 (#3)     68.6 (#4)
Sexual Orientation      76.2 (#1)     78.6 (#3)
Age                     64.4 (#1)     67.8 (#2)
Nationality             61.6 (#2)     62.9 (#3)
Disability              76.7 (#3)     76.7 (#3)
Physical Appearance     74.6 (#2)     76.2 (#3)
Socioeconomic Status    73.8 (#3)     76.2 (#4)
Overall                 67.2 (#2)     69.5 (#1)
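
For context on what these numbers measure: each CrowS-Pairs example pairs a more-stereotypical sentence with a less-stereotypical one, and the reported score is the percentage of pairs for which the model assigns higher likelihood to the stereotypical sentence (a score near 50 would indicate little measured preference). Below is a minimal sketch of that comparison for a causal language model; the checkpoint name, example pair, and the simple total-log-likelihood scoring are illustrative assumptions and need not match the paper's exact protocol.

```python
# Sketch of a CrowS-Pairs-style bias score for a causal LM: score both
# sentences of each pair by total log-likelihood and report the fraction of
# pairs where the stereotypical sentence is preferred.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def sentence_log_likelihood(text: str) -> float:
    """Total log-probability of the sentence's tokens under the causal LM."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # labels=ids yields the mean cross-entropy over predicted tokens
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)

# Hypothetical pair: (more stereotypical, less stereotypical)
pairs = [
    ("The nurse said she would be right back.",
     "The nurse said he would be right back."),
]
preferred = sum(
    sentence_log_likelihood(stereo) > sentence_log_likelihood(anti)
    for stereo, anti in pairs
)
print(f"bias score: {100 * preferred / len(pairs):.1f}%  (50% = no preference)")
```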

Hate Speech Detection on Ethos (Binary), F1-score (# = global leaderboard rank)

Model       Setting      F1-score
OPT-175B    zero-shot    0.667 (#7)
OPT-175B    one-shot     0.713 (#6)
OPT-175B    few-shot     0.759 (#4)
Davinci     zero-shot    0.628 (#10)
Davinci     one-shot     0.616 (#11)
Davinci     few-shot     0.354 (#12)
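
The zero-, one-, and few-shot settings above differ only in how many labeled demonstrations are placed in the prompt ahead of the test comment. A minimal sketch of that prompting setup follows; the prompt template, label words, and demonstrations are illustrative assumptions rather than the paper's exact format.

```python
# Sketch of k-shot prompting for binary hate-speech classification: k labeled
# demonstrations are prepended to the test instance and the model's
# continuation is read off as the predicted label (k=0 is the zero-shot case).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def build_prompt(demos, text):
    """demos: list of (comment, 'yes'/'no') examples; template is hypothetical."""
    lines = [f"Comment: {c}\nHate speech: {label}" for c, label in demos]
    lines.append(f"Comment: {text}\nHate speech:")
    return "\n\n".join(lines)

demos = [("You people should all disappear.", "yes"),
         ("Have a great weekend everyone!", "no")]  # few-shot with k=2
prompt = build_prompt(demos, "I love this community.")

inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=2, do_sample=False)
answer = tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(answer.strip())  # predicted label; F1 is computed over such predictions
```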
