TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Question Answering	COPA	BERT-large 340M	Accuracy	80.8	# 36
Question Answering	COPA	BERT-SocialIQA 340M	Accuracy	83.4	# 33
Question Answering	SIQA	Random chance baseline	Accuracy	33.3	# 20
Question Answering	SIQA	BERT-base 110M (fine-tuned)	Accuracy	63.1	# 10
Question Answering	SIQA	GPT-1 117M (fine-tuned)	Accuracy	63	# 11
Question Answering	SIQA	BERT-large 340M (fine-tuned)	Accuracy	64.5	# 9
Coreference Resolution	Winograd Schema Challenge	BERT-large 340M	Accuracy	67	# 39
Coreference Resolution	Winograd Schema Challenge	BERT-SocialIQA 340M	Accuracy	72.5	# 30

SocialIQA: Commonsense Reasoning about Social Interactions

22 Apr 2019 · Maarten Sap, Hannah Rashkin, Derek Chen, Ronan LeBras, Yejin Choi

We introduce Social IQa, the first large-scale benchmark for commonsense reasoning about social situations. Social IQa contains 38,000 multiple-choice questions for probing emotional and social intelligence in a variety of everyday situations (e.g., Q: "Jordan wanted to tell Tracy a secret, so Jordan leaned towards Tracy. Why did Jordan do this?" A: "Make sure no one else could hear"). Through crowdsourcing, we collect commonsense questions along with correct and incorrect answers about social interactions, using a new framework that mitigates stylistic artifacts in incorrect answers by asking workers to provide the right answer to a different but related question. Empirical results show that our benchmark is challenging for existing question-answering models based on pretrained language models, compared to human performance (>20% gap). Notably, we further establish Social IQa as a resource for transfer learning of commonsense knowledge, achieving state-of-the-art performance on multiple commonsense reasoning tasks (Winograd Schemas, COPA).

Tasks

Common Sense Reasoning

Coreference Resolution

Multiple-choice

Question Answering

Transfer Learning

Datasets

Introduced in the Paper:

SIQA

Used in the Paper:

CommonsenseQA

BookCorpus

WSC

COPA

Results from the Paper

Ranked #9 on Question Answering on SIQA
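
The accuracy figures listed for this paper can be compared with simple arithmetic. The sketch below is only an illustration: it copies the reported accuracies into dictionaries and computes the gain of BERT-SocialIQA over BERT-large on the transfer tasks, plus each SIQA model's margin over the random chance baseline. The names `TRANSFER`, `SIQA`, `transfer_gain`, and `margin_over_chance` are hypothetical helpers introduced here, not part of any released code.

```python
# Accuracy values (percent) copied from the results listed for this paper.
# All names below are hypothetical helpers for illustration only.
TRANSFER = {
    "COPA": {"BERT-large 340M": 80.8, "BERT-SocialIQA 340M": 83.4},
    "Winograd Schema Challenge": {"BERT-large 340M": 67.0, "BERT-SocialIQA 340M": 72.5},
}

SIQA = {
    "Random chance baseline": 33.3,
    "BERT-base 110M (fine-tuned)": 63.1,
    "GPT-1 117M (fine-tuned)": 63.0,
    "BERT-large 340M (fine-tuned)": 64.5,
}


def transfer_gain(dataset: str) -> float:
    """Accuracy gain (percentage points) of BERT-SocialIQA over BERT-large."""
    scores = TRANSFER[dataset]
    return round(scores["BERT-SocialIQA 340M"] - scores["BERT-large 340M"], 1)


def margin_over_chance(model: str) -> float:
    """Accuracy margin (percentage points) of a SIQA model over random chance."""
    return round(SIQA[model] - SIQA["Random chance baseline"], 1)


if __name__ == "__main__":
    for dataset in TRANSFER:
        print(f"{dataset}: +{transfer_gain(dataset)} points from SocialIQA transfer")
    for model in SIQA:
        if model != "Random chance baseline":
            print(f"{model}: +{margin_over_chance(model)} points over chance")
```

The differences are rounded to one decimal place to match the precision of the reported values.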
