Common Sense Reasoning

254 papers with code • 24 benchmarks • 52 datasets

Common sense reasoning tasks are intended to require the model to go beyond pattern recognition. Instead, the model should use "common sense" or world knowledge to make inferences.

Latest papers with no code

Leveraging Large Language Model-based Room-Object Relationships Knowledge for Enhancing Multimodal-Input Object Goal Navigation

no code yet • 21 Mar 2024

In this study, we propose a data-driven, modular approach, trained on a dataset that incorporates common-sense knowledge of object-to-room relationships extracted from a large language model.
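
The snippet only hints at how such a dataset is built. Below is a minimal sketch of one plausible way to extract object-to-room relationship knowledge from an LLM; the prompt wording, the score scale, and the `ask_llm` helper are illustrative assumptions, not details from the paper.

```python
# Illustrative sketch (not the paper's pipeline): building an object-to-room
# relationship table by querying an LLM for co-occurrence likelihoods.

OBJECTS = ["toothbrush", "skillet", "laptop"]
ROOMS = ["bathroom", "kitchen", "living room", "bedroom"]

def ask_llm(prompt: str) -> str:
    """Hypothetical helper; swap in any real chat-completion client."""
    raise NotImplementedError

def build_relationship_table() -> dict[tuple[str, str], float]:
    """Score how likely each object is to be found in each room (0 to 1)."""
    table = {}
    for obj in OBJECTS:
        for room in ROOMS:
            prompt = (
                f"On a scale from 0 to 1, how likely is a {obj} "
                f"to be found in a {room}? Answer with a number only."
            )
            table[(obj, room)] = float(ask_llm(prompt))
    return table
```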

To Help or Not to Help: LLM-based Attentive Support for Human-Robot Group Interactions

no code yet • 19 Mar 2024

In addition to following user instructions, Attentive Support can decide when and how to support the humans, and when to remain silent so as not to disturb the group.

LogicalDefender: Discovering, Extracting, and Utilizing Common-Sense Knowledge

no code yet • 18 Mar 2024

Experiments show that our model achieves better logical performance and that the extracted logical knowledge can be applied effectively to other scenarios.

PhD: A Prompted Visual Hallucination Evaluation Dataset

no code yet • 17 Mar 2024

The rapid growth of Large Language Models (LLMs) has driven the development of Large Vision-Language Models (LVLMs).

ContextGPT: Infusing LLMs Knowledge into Neuro-Symbolic Activity Recognition Models

no code yet • 11 Mar 2024

Neuro-Symbolic AI (NeSy) provides an interesting research direction for mitigating this issue: infusing common-sense knowledge about human activities, and the contexts in which they can be performed, into HAR deep learning classifiers.

How to Understand Named Entities: Using Common Sense for News Captioning

no code yet • 11 Mar 2024

Our approach consists of three modules: (a) a Filter Module, which aims to clarify the common sense concerning a named entity from two aspects: what does it mean?

Repeated Padding as Data Augmentation for Sequential Recommendation

no code yet • 11 Mar 2024

Specifically, we use the original interaction sequence as the padding content and fill the padding positions with it during model training.
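
A minimal sketch of the repeated-padding idea as described: rather than filling padding slots with a special pad token, the user's own interaction sequence is cycled into them. Left-padding, the handling of partial repeats, and the zero fallback for empty input are assumptions, not details from the paper.

```python
def repeated_padding(seq: list[int], max_len: int) -> list[int]:
    """Pad an item-ID sequence to max_len by repeating the sequence itself."""
    if len(seq) >= max_len:
        return seq[-max_len:]  # keep the most recent interactions
    if not seq:
        return [0] * max_len  # fall back to pad-token 0 for empty input (assumption)
    pad_len = max_len - len(seq)
    # Tile the sequence, then keep the last pad_len items so the padding
    # runs seamlessly into the original sequence on the left.
    repeats = (seq * (pad_len // len(seq) + 1))[-pad_len:]
    return repeats + seq

# Example: a short session of item IDs padded to length 8.
print(repeated_padding([3, 7, 42], 8))  # [7, 42, 3, 7, 42, 3, 7, 42]
```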

Telecom Language Models: Must They Be Large?

no code yet • 7 Mar 2024

The increasing interest in Large Language Models (LLMs) within the telecommunications sector underscores their potential to revolutionize operational efficiency.

The Claude 3 Model Family: Opus, Sonnet, Haiku

no code yet • Preprint 2024

We introduce Claude 3, a new family of large multimodal models: Claude 3 Opus, our most capable offering; Claude 3 Sonnet, which provides a combination of skills and speed; and Claude 3 Haiku, our fastest and least expensive model.

SERVAL: Synergy Learning between Vertical Models and LLMs towards Oracle-Level Zero-shot Medical Prediction

no code yet • 3 Mar 2024

Recent developments in large language models (LLMs) have exhibited impressive zero-shot proficiency on generic and common sense questions.