Data-to-Text Generation

105 papers with code • 24 benchmarks • 22 datasets

A classic problem in natural-language generation (NLG) is to take structured data, such as a table, as input and to produce text that adequately and fluently describes that data as output. Unlike machine translation, which aims for a complete transduction of the sentence to be translated, this form of NLG is usually taken to involve (at least) two separate challenges: what to say, the selection of an appropriate subset of the input data to discuss, and how to say it, the surface realization of that selected content.
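The two sub-problems can be made concrete with a toy sketch, assuming a simple key-value record format; the function names, record fields, and template here are illustrative assumptions, not any particular system's API:

```python
# A minimal, illustrative sketch of the two sub-problems. The record
# format and all names are assumptions for exposition only.

def select_content(record, keep):
    """'What to say': pick the subset of input fields to verbalize."""
    return [(field, record[field]) for field in keep if field in record]

def realize(selected):
    """'How to say it': surface-realize the selection with a template."""
    clauses = ", ".join(f"the {field} is {value}" for field, value in selected)
    return f"In summary, {clauses}."

record = {"winner": "HIFK", "score": "3-1", "venue": "Helsinki"}
print(realize(select_content(record, ["winner", "score"])))
# In summary, the winner is HIFK, the score is 3-1.
```

Real systems replace both stages with learned models, but the decomposition itself is the same.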

(Image credit: Data-to-Text Generation with Content Selection and Planning)

Libraries

Use these libraries to find Data-to-Text Generation models and implementations

Most implemented papers

E2E NLG Challenge: Neural Models vs. Templates

UKPLab/e2e-nlg-challenge-2017 WS 2018

E2E NLG Challenge is a shared task on generating restaurant descriptions from sets of key-value pairs.
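The "templates" side of this comparison can be sketched as a single hand-written rule over E2E-style meaning representations. The slot names mirror the E2E format, but the template itself is an illustrative assumption, not the one used by the UKPLab system:

```python
# Hedged sketch: one hand-written template over E2E-style key-value
# meaning representations (MRs). Illustrative only.

def template_realize(mr):
    """Render an E2E-style MR (dict of slots) as a restaurant description."""
    text = f"{mr['name']} is a {mr['eatType']}"
    if "food" in mr:
        text += f" serving {mr['food']} food"
    if "area" in mr:
        text += f" in the {mr['area']}"
    return text + "."

mr = {"name": "Aromi", "eatType": "coffee shop",
      "food": "Italian", "area": "city centre"}
print(template_realize(mr))
# Aromi is a coffee shop serving Italian food in the city centre.
```

Such templates are faithful by construction but brittle, which is exactly the trade-off against neural models that the shared task probes.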

Data-to-Text Generation with Style Imitation

ha-lins/DTG-SI Findings of the Association for Computational Linguistics 2020

That is, the model learns to imitate the writing style of any given exemplar sentence, with automatic adaptations to faithfully describe the content record.

Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Generation

AmitMY/chimera NAACL 2019

We propose to split the generation process into a symbolic text-planning stage that is faithful to the input, followed by a neural generation stage that focuses only on realization.
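The split can be sketched as a deterministic symbolic plan over input triples followed by a separate realizer; the ordering rule and the clause-level realization below are assumptions for exposition, not the paper's actual components:

```python
# Illustrative two-stage pipeline: a symbolic text plan (an ordered list
# of input triples) followed by a simple realizer. All rules here are
# assumptions, not the chimera system's implementation.

def plan(triples):
    """Text planning: fix an order over the input facts (here, sorted)."""
    return sorted(triples)

def realize(ordered_triples):
    """Realization: verbalize each planned fact as one clause."""
    clauses = [f"{s} {p} {o}" for (s, p, o) in ordered_triples]
    return ". ".join(clauses) + "."

triples = [("John", "works for", "Acme"), ("Acme", "is located in", "Berlin")]
print(realize(plan(triples)))
# Acme is located in Berlin. John works for Acme.
```

Because the plan is symbolic, faithfulness can be checked against the input before any neural realization happens, which is the motivation for the split.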

Copy mechanism and tailored training for character-based data-to-text generation

marco-roberti/char-data-to-text-gen 26 Apr 2019

In the last few years, many different methods have been focusing on using deep recurrent neural networks for natural language generation.

Creating a Corpus for Russian Data-to-Text Generation Using Neural Machine Translation and Post-Editing

shimorina/bsnlp-2019 WS 2019

In this paper, we propose an approach for semi-automatically creating a data-to-text (D2T) corpus for Russian that can be used to learn a D2T natural language generation model.

Neural data-to-text generation: A comparison between pipeline and end-to-end architectures

ThiagoCF05/webnlg IJCNLP 2019

In contrast, recent neural models for data-to-text generation have been proposed as end-to-end approaches, where the non-linguistic input is rendered in natural language with far fewer explicit intermediate representations in between.

Enhancing AMR-to-Text Generation with Dual Graph Representations

UKPLab/emnlp2019-dualgraph IJCNLP 2019

Generating text from graph-based data, such as Abstract Meaning Representation (AMR), is a challenging task due to the inherent difficulty of properly encoding the structure of a graph with labeled edges.

Improving Quality and Efficiency in Plan-based Neural Data-to-Text Generation

AmitMY/chimera WS 2019

We follow the step-by-step approach to neural data-to-text generation we proposed in Moryossef et al. (2019), in which the generation process is divided into a text-planning stage followed by a plan-realization stage.

Template-free Data-to-Text Generation of Finnish Sports News

scoopmatic/finnish-hockey-news-generation-paper WS (NoDaLiDa) 2019

News articles such as sports game reports are often thought to closely follow the underlying game statistics, but in practice they contain a notable amount of background knowledge, interpretation, insight into the game, and quotes that are not present in the official statistics.