Common Sense Reasoning

254 papers with code • 24 benchmarks • 52 datasets

Common sense reasoning tasks are intended to require the model to go beyond pattern recognition. Instead, the model should use "common sense" or world knowledge to make inferences.

Benchmarks

Add a Result

These leaderboards are used to track progress in Common Sense Reasoning

Dataset	Best Model	Compare
WinoGrande	ST-MoE-32B 269B (fine-tuned)	See all
ARC (Challenge)	GPT-4 (few-shot, k=25)	See all
ARC (Easy)	ST-MoE-32B 269B (fine-tuned)	See all
CommonsenseQA	DeBERTaV3-large+KEAR	See all
ReCoRD	Turing NLR v5 XXL 5.4B (fine-tuned)	See all
BIG-bench (Disambiguation QA)	PaLM 2 (few-shot, k=3, Direct)	See all
BIG-bench (Causal Judgment)	PaLM 2 (few-shot, k=3, Direct)	See all
BIG-bench (Date Understanding)	PaLM 2 (few-shot, k=3, CoT)	See all
BIG-bench (Sports Understanding)	PaLM 2(few-shot, k=3, CoT)	See all
Event2Mind test	ConvNet	See all
Russian Event2Mind	ruscorpora word2vec (skipgram) + GRU	See all
RuCoS	Human Benchmark	See all
RWSD	Golden Transformer	See all
PARus	Human Benchmark	See all
SWAG	DeBERTalarge	See all
BIG-bench (Winowhy)	PaLM-540B (few-shot, k=5)	See all
BIG-bench (Known Unknowns)	PaLM-540B (few-shot, k=5)	See all
Event2Mind dev	ConvNet	See all
BIG-bench (Logical Sequence)	Chinchilla-70B (few-shot, k=5)	See all
CODAH	BERT Large	See all
Visual Dialog v0.9	PDUN	See all
CrowdSource QA	BERT	See all
Visual Dialog v0.9	NMN [kottur2018visual]	See all
WinoGAViL	ViLT	See all

Show all 24 benchmarks

Collapse benchmarks

Libraries

Use these libraries to find Common Sense Reasoning models and implementations

huggingface/transformers

9 papers

124,793

Tencent/TurboTransformers

3 papers

1,442

Leeroo-AI/mergoo

3 papers

202

volcengine/vegiantmodel

3 papers

197

See all 23 libraries.

Datasets

Subtasks

Anachronisms

Discourse Marker Prediction

Visual Commonsense Tests

Multiview Contextual Commonsense Inference

Most implemented papers

Most implemented Social Latest No code

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

google-research/bert • • NAACL 2019

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.

528

Paper
Code

RoBERTa: A Robustly Optimized BERT Pretraining Approach

pytorch/fairseq • • 26 Jul 2019

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

Paper
Code

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

huggingface/transformers • • arXiv 2019

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).

Paper
Code

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

google-research/ALBERT • • ICLR 2020

Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks.

Paper
Code

Language Models are Few-Shot Learners

openai/gpt-3 • NeurIPS 2020

By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do.

Paper
Code

LLaMA: Open and Efficient Foundation Language Models

facebookresearch/llama • • arXiv 2023

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters.

Paper
Code

A Neural Conversational Model

farizrahman4u/seq2seq • 19 Jun 2015

We find that this straightforward model can generate simple conversations given a large conversational training dataset.

Paper
Code

Language Models are Unsupervised Multitask Learners

openai/gpt-2 • • Preprint 2019

Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on taskspecific datasets.

Paper
Code

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

microsoft/guidance • 28 Jan 2022

We explore how generating a chain of thought -- a series of intermediate reasoning steps -- significantly improves the ability of large language models to perform complex reasoning.

Paper
Code

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

state-spaces/mamba • • 1 Dec 2023

Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module.

Paper
Code

Common Sense Reasoning

Benchmarks Add a Result

Libraries

Datasets

Subtasks

Most implemented papers

Content

Benchmarks

Add a Result