Natural Language Inference

729 papers with code • 43 benchmarks • 77 datasets

Natural language inference (NLI) is the task of determining whether a "hypothesis" is true (entailment), false (contradiction), or undetermined (neutral) given a "premise".

Example:

Premise: A man inspects the uniform of a figure in some East Asian country.
Hypothesis: The man is sleeping.
Label: contradiction

Premise: An older and younger man smiling.
Hypothesis: Two men are smiling and laughing at the cats playing on the floor.
Label: neutral

Premise: A soccer game with multiple males playing.
Hypothesis: Some men are playing a sport.
Label: entailment

Approaches to NLI range from earlier symbolic and statistical methods to more recent deep learning models. Benchmark datasets for NLI include SNLI, MultiNLI, and SciTail, among others. You can get hands-on practice on the SNLI task by following this d2l.ai chapter.
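
To make the task definition concrete, here is a minimal sketch of the NLI interface in Python: a predictor maps a (premise, hypothesis) pair to one of the three labels, and a scorer measures label accuracy. The `baseline` predictor is a hypothetical placeholder, not a real model; the examples are the three shown above.

```python
from dataclasses import dataclass

LABELS = ("entailment", "contradiction", "neutral")

@dataclass
class NLIExample:
    premise: str
    hypothesis: str
    label: str  # one of LABELS

# The three examples from the table above.
EXAMPLES = [
    NLIExample("A man inspects the uniform of a figure in some East Asian country.",
               "The man is sleeping.", "contradiction"),
    NLIExample("An older and younger man smiling.",
               "Two men are smiling and laughing at the cats playing on the floor.",
               "neutral"),
    NLIExample("A soccer game with multiple males playing.",
               "Some men are playing a sport.", "entailment"),
]

def accuracy(predict, examples):
    """Fraction of examples where predict(premise, hypothesis) matches the gold label."""
    return sum(predict(ex.premise, ex.hypothesis) == ex.label for ex in examples) / len(examples)

def baseline(premise, hypothesis):
    # Hypothetical stand-in for a trained NLI model: always predict "entailment".
    return "entailment"

# Exactly one of the three gold labels is "entailment", so this prints 1/3.
print(accuracy(baseline, EXAMPLES))
```

Any real system (symbolic, statistical, or neural) slots in wherever `baseline` does, which is what makes leaderboard comparison across approaches straightforward.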

TLDR at SemEval-2024 Task 2: T5-generated clinical-Language summaries for DeBERTa Report Analysis

shahriarnz14/tldr-t5-generated-clinical-language-for-deberta-report-analysis 14 Apr 2024

This paper introduces novel methodologies for the Natural Language Inference for Clinical Trials (NLI4CT) task.

XNLIeu: a dataset for cross-lingual NLI in Basque

faceonlive/ai-research 10 Apr 2024

We have conducted a series of experiments using mono- and multilingual LLMs to assess: a) the effect of professional post-editing on the MT system; b) the best cross-lingual strategy for NLI in Basque; and c) whether the choice of the best cross-lingual strategy is influenced by the fact that the dataset is built by translation.

IITK at SemEval-2024 Task 2: Exploring the Capabilities of LLMs for Safe Biomedical Natural Language Inference for Clinical Trials

exploration-lab/iitk-semeval-2024-task-2-clinical-nli 6 Apr 2024

Large Language Models (LLMs) have demonstrated state-of-the-art performance on various natural language processing (NLP) tasks across multiple domains, yet they are prone to shortcut learning and factual inconsistencies.

Forget NLI, Use a Dictionary: Zero-Shot Topic Classification for Low-Resource Languages with Application to Luxembourgish

faceonlive/ai-research 5 Apr 2024

A common method for ZSC is to fine-tune a language model on a Natural Language Inference (NLI) dataset and then use it to infer the entailment between the input document and the target labels.
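
The NLI-based zero-shot classification recipe described here can be sketched as follows. `entail_prob` is assumed to come from an NLI-fine-tuned model; the keyword-overlap `toy_entail_prob` below is a hypothetical stand-in used only to keep the sketch runnable.

```python
def zero_shot_classify(document, labels, entail_prob):
    """Pick the label whose hypothesis the model finds most entailed by the document.

    entail_prob(premise, hypothesis) -> float is assumed to be supplied by an
    NLI model fine-tuned on a dataset such as MultiNLI.
    """
    hypotheses = {label: f"This text is about {label}." for label in labels}
    scores = {label: entail_prob(document, hyp) for label, hyp in hypotheses.items()}
    return max(scores, key=scores.get)

def toy_entail_prob(premise, hypothesis):
    # Crude lexical-overlap score standing in for a real entailment probability.
    tokens = lambda s: {w.strip(".,") for w in s.lower().split()}
    p, h = tokens(premise), tokens(hypothesis)
    return len(p & h) / len(h)

doc = "The striker scored twice in the final minutes of the football match."
print(zero_shot_classify(doc, ["football", "politics", "cooking"], toy_entail_prob))  # football
```

The paper's point is precisely that this entailment detour can be replaced for low-resource languages where good NLI data is scarce.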

Investigating the Robustness of Modelling Decisions for Few-Shot Cross-Topic Stance Detection: A Preregistered Study

faceonlive/ai-research 5 Apr 2024

In this paper, we investigate the robustness of operationalization choices for few-shot stance detection, with special attention to modelling stance across different topics.

Evaluating Generative Language Models in Information Extraction as Subjective Question Correction

thu-keg/sqc-score 4 Apr 2024

The paper identifies two challenges in evaluating generative language models for information extraction: (1) the imprecision of existing evaluation metrics, which struggle to effectively gauge semantic consistency between model outputs and ground truth; and (2) the inherent incompleteness of evaluation benchmarks, primarily due to restrictive human annotation schemas, which results in underestimated LLM performance.

Affective-NLI: Towards Accurate and Interpretable Personality Recognition in Conversation

preke/affective-nli 3 Apr 2024

To utilize affectivity within dialog content for accurate personality recognition, we fine-tuned a pre-trained language model specifically for emotion recognition in conversations, facilitating real-time affective annotations for utterances.

On the Role of Summary Content Units in Text Summarization Evaluation

tristanratz/scu-text-evaluation 2 Apr 2024

At the heart of the Pyramid evaluation method for text summarization lie human written summary content units (SCUs).

AILS-NTUA at SemEval-2024 Task 6: Efficient model tuning for hallucination detection and analysis

ngregoriade/semeval2024-shroom 1 Apr 2024

In this paper, we present our team's submissions for SemEval-2024 Task-6 - SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes.

Unveiling Divergent Inductive Biases of LLMs on Temporal Data

sindhukrao/llm_temporal_bias 1 Apr 2024

Unraveling the intricate details of events in natural language necessitates a subtle understanding of temporal dynamics.
