Data-to-Text Generation

105 papers with code • 24 benchmarks • 22 datasets

A classic problem in natural-language generation (NLG) involves taking structured data, such as a table, as input and producing text that adequately and fluently describes this data as output. Unlike machine translation, which aims for a complete transduction of the sentence to be translated, this form of NLG is usually taken to require addressing (at least) two separate challenges: what to say, the selection of an appropriate subset of the input data to discuss, and how to say it, the surface realization of the selected content.
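The two stages above can be sketched in a few lines. This is a minimal illustrative pipeline, not any particular system's method: the record fields (`entity`, `value`, `unit`) and the salience threshold are hypothetical assumptions, and the surface realizer is a simple template rather than a learned model.

```python
# Minimal sketch of the two data-to-text stages:
# content selection ("what to say") followed by
# template-based surface realization ("how to say it").
# Record fields and the salience rule are illustrative assumptions.

def select_content(records, min_value=10):
    """What to say: keep only records deemed salient enough."""
    return [r for r in records if r["value"] >= min_value]

def realize(records):
    """How to say it: render each selected record with a template."""
    clauses = [f"{r['entity']} scored {r['value']} {r['unit']}"
               for r in records]
    return " and ".join(clauses) + "." if clauses else ""

# A toy box-score table (hypothetical data).
table = [
    {"entity": "LeBron James", "value": 32, "unit": "points"},
    {"entity": "LeBron James", "value": 4, "unit": "turnovers"},  # filtered out
    {"entity": "Kevin Love", "value": 18, "unit": "points"},
]

text = realize(select_content(table))
print(text)  # LeBron James scored 32 points and Kevin Love scored 18 points.
```

In neural systems both stages are typically learned jointly rather than hand-written, but the decomposition remains the same.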

(Image credit: Data-to-Text Generation with Content Selection and Planning)

Libraries

Use these libraries to find Data-to-Text Generation models and implementations

Most implemented papers

CoNT: Contrastive Neural Text Generation

google-research-datasets/ToTTo 29 May 2022

We validate CoNT on five generation tasks with ten benchmarks, including machine translation, summarization, code comment generation, data-to-text generation and commonsense generation.

Triples-to-isiXhosa (T2X): Addressing the Challenges of Low-Resource Agglutinative Data-to-Text Generation

francois-meyer/t2x 12 Mar 2024

In this paper we tackle data-to-text for isiXhosa, which is low-resource and agglutinative.

What to talk about and how? Selective Generation using LSTMs with Coarse-to-Fine Alignment

HMEIatJHU/SelectiveGeneration NAACL 2016

We propose an end-to-end, domain-independent neural encoder-aligner-decoder model for selective generation, i.e., the joint task of content selection and surface realization.

The Code2Text Challenge: Text Generation in Source Code Libraries

yakazimir/Code-Datasets 31 Jul 2017

We propose a new shared task for tactical data-to-text generation in the domain of source code libraries.

A surprisingly effective out-of-the-box char2char model on the E2E NLG Challenge dataset

shubhamagarwal92/sigdialSubmission WS 2017

We train a char2char model on the E2E NLG Challenge data, by exploiting "out-of-the-box" the recently released tfseq2seq framework, using some of the standard options offered by this tool.

Bootstrapping Generators from Noisy Data

EdinburghNLP/wikigen NAACL 2018

A core step in statistical data-to-text generation concerns learning correspondences between structured data representations (e.g., facts in a database) and associated texts.

Describing a Knowledge Base

EagleW/Describing_a_Knowledge_Base WS 2018

We aim to automatically generate natural language descriptions about an input structured knowledge base (KB).

Operations Guided Neural Networks for High Fidelity Data-To-Text Generation

janenie/espn-nba-data 8 Sep 2018

Even though the generated texts are mostly fluent and informative, neural models often produce descriptions that are not consistent with the input structured data.

Findings of the E2E NLG Challenge

UFAL-DSG/tgen WS 2018

This paper summarises the experimental setup and results of the first shared task on end-to-end (E2E) natural language generation (NLG) in spoken dialogue systems.

End-to-End Content and Plan Selection for Data-to-Text Generation

sebastianGehrmann/diverse_ensembling WS 2018

Learning to generate fluent natural language from structured data with neural networks has become a common approach for NLG.