Data-to-Text Generation

105 papers with code • 24 benchmarks • 22 datasets

A classic problem in natural-language generation (NLG) involves taking structured data, such as a table, as input and producing text that adequately and fluently describes this data as output. Unlike machine translation, which aims for complete transduction of the sentence to be translated, this form of NLG is usually taken to require addressing (at least) two separate challenges: what to say (the selection of an appropriate subset of the input data to discuss) and how to say it (the surface realization of the selected content).
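To make the two challenges concrete, here is a minimal Python sketch over a hypothetical box-score record; the field names and the template are illustrative assumptions, not drawn from any particular dataset or system.

```python
# Minimal sketch of the two sub-problems, using a hypothetical box-score record.
record = {
    "team": "Hawks", "opponent": "Lions",
    "team_points": 102, "opponent_points": 98, "attendance": 17350,
}

# "What to say": content selection -- keep only the fields worth reporting.
selected = {k: record[k] for k in ("team", "opponent", "team_points", "opponent_points")}

# "How to say it": surface realization -- turn the selected content into fluent text.
text = "The {team} beat the {opponent} {team_points}-{opponent_points}.".format(**selected)
print(text)  # The Hawks beat the Lions 102-98.
```

Neural systems typically learn both steps jointly, but the same decomposition underlies most of the work listed below.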

(Image credit: Data-to-Text Generation with Content Selection and Planning)

Libraries

Use these libraries to find Data-to-Text Generation models and implementations

Latest papers with no code

Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder

no code yet • 8 Apr 2023

Grounded in our analysis, we propose a novel partial attention language model to solve the attention degeneration problem.

MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation

no code yet • 16 Dec 2022

We propose MURMUR, a neuro-symbolic modular approach to text generation from semi-structured data with multi-step reasoning.

Grounded Keys-to-Text Generation: Towards Factual Open-Ended Generation

no code yet • 4 Dec 2022

We propose a new grounded keys-to-text generation task: generate a factual description of an entity given a set of guiding keys and grounding passages.
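For illustration only, a hypothetical instance of the described task is sketched below; the entity, keys, passages, and target text are invented, and the paper's actual data format may differ.

```python
# Invented example of a grounded keys-to-text instance (not taken from the paper's dataset).
example = {
    "entity": "Ada Lovelace",
    "guiding_keys": ["occupation", "known_for", "collaborator"],
    "grounding_passages": [
        "Ada Lovelace worked with Charles Babbage on the Analytical Engine.",
        "She is often described as the first computer programmer.",
    ],
    "target_text": (
        "Ada Lovelace was a mathematician known as the first computer programmer "
        "for her work with Charles Babbage on the Analytical Engine."
    ),
}
```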

Time-aware Prompting for Text Generation

no code yet • 3 Nov 2022

Despite showing a smaller performance drop when tested on data drawn from a later time, linear prompts focus more on non-temporal information and are less sensitive to the given timestamps, according to human evaluations and sensitivity analyses.

Mapping Process for the Task: Wikidata Statements to Text as Wikipedia Sentences

no code yet • 23 Oct 2022

Widely acknowledged as one of the most successful online collaborative projects in human society, Wikipedia has grown rapidly in recent years and continually seeks to expand its content and disseminate knowledge to everyone around the world.

Table-To-Text generation and pre-training with TabT5

no code yet • 17 Oct 2022

Encoder-only transformer models have been successfully applied to different table understanding tasks, as in TAPAS (Herzig et al., 2020).

Comparing Computational Architectures for Automated Journalism

no code yet • 8 Oct 2022

The majority of NLG systems have been designed following either a template-based or a pipeline-based architecture.
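As a rough illustration of the contrast, the sketch below implements both styles over a hypothetical weather record; the stage names, fields, and templates are assumptions for illustration, not the systems compared in the paper.

```python
record = {"city": "Oslo", "temp_c": -3, "condition": "light snow", "humidity": 81}

# Template-based: one fixed surface form, slots filled directly from the record.
def template_nlg(rec):
    return "In {city}, expect {condition} with temperatures around {temp_c} °C.".format(**rec)

# Pipeline-based: explicit stages (content selection -> sentence planning -> realization).
def pipeline_nlg(rec):
    content = {k: rec[k] for k in ("city", "condition", "temp_c")}   # content selection
    plan = ["report_condition", "report_temperature"]                # sentence planning
    realizers = {                                                    # surface realization
        "report_condition": lambda c: f"{c['condition'].capitalize()} is expected in {c['city']}.",
        "report_temperature": lambda c: f"Temperatures will be around {c['temp_c']} °C.",
    }
    return " ".join(realizers[step](content) for step in plan)

print(template_nlg(record))  # In Oslo, expect light snow with temperatures around -3 °C.
print(pipeline_nlg(record))  # Light snow is expected in Oslo. Temperatures will be around -3 °C.
```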

Calibrating Sequence Likelihood Improves Conditional Language Generation

no code yet • 30 Sep 2022

Conditional language models are predominantly trained with maximum likelihood estimation (MLE), giving probability mass to sparsely observed target sequences.
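For reference, the standard MLE objective for a conditional language model is the token-level negative log-likelihood of the target sequence given the source. A minimal PyTorch sketch, with made-up shapes and an assumed padding id, looks like this:

```python
import torch
import torch.nn.functional as F

vocab_size, pad_id = 100, 0                # assumed vocabulary size and padding id
logits = torch.randn(2, 5, vocab_size)     # model scores per target position (batch=2, len=5)
targets = torch.tensor([[7, 3, 9, 0, 0],   # padded gold target token ids
                        [4, 8, 2, 6, 1]])

# MLE training loss: average negative log-likelihood of the observed target tokens.
nll = F.cross_entropy(
    logits.reshape(-1, vocab_size),        # (batch * len, vocab)
    targets.reshape(-1),                   # (batch * len,)
    ignore_index=pad_id,                   # padding positions are excluded from the loss
)
print(nll)
```

Calibration methods such as the one proposed here adjust this training signal so that model likelihoods better reflect output quality, rather than only the sparsely observed gold sequences.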

XF2T: Cross-lingual Fact-to-Text Generation for Low-Resource Languages

no code yet • 22 Sep 2022

Our extensive experiments show that a multilingual mT5 model which uses fact-aware embeddings with structure-aware input encoding leads to the best results on average across the twelve languages.

High Recall Data-to-text Generation with Progressive Edit

no code yet • 9 Aug 2022

We observed that when the same target sentence was repeated twice, a Transformer (T5)-based model generated an output made up of asymmetric sentences from the structured inputs.