Data-to-Text Generation

107 papers with code • 24 benchmarks • 22 datasets

A classic problem in natural-language generation (NLG) is to take structured data, such as a table, as input and produce text that adequately and fluently describes this data as output. Unlike machine translation, which aims for complete transduction of the sentence to be translated, this form of NLG is usually taken to involve (at least) two separate challenges: what to say, the selection of an appropriate subset of the input data to discuss, and how to say it, the surface realization of the selected content.
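
A minimal illustration of these two stages, using a hand-written template (the records, selection rule, and template below are all hypothetical; real systems learn both stages from data):

```python
# A minimal, template-based sketch of the two classic NLG stages.
# The input records, selection heuristic, and template are hypothetical.

# Structured input: a table of (field, value) records about a game.
records = {
    "team_home": "Hawks",
    "team_away": "Magic",
    "pts_home": 108,
    "pts_away": 98,
    "attendance": 17231,
}

# Stage 1 -- "what to say": select a salient subset of the records.
# Here we keep only the teams and scores, dropping attendance.
selected = {k: v for k, v in records.items()
            if k.startswith(("team_", "pts_"))}

# Stage 2 -- "how to say it": realize the selected content as text.
template = "The {team_home} defeated the {team_away} {pts_home}-{pts_away}."
print(template.format(**selected))
# -> The Hawks defeated the Magic 108-98.
```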

(Image credit: Data-to-Text Generation with Content Selection and Planning)

Most implemented papers

Transition-Based Deep Input Linearization

SUTDNLP/ZGen EACL 2017

Traditional methods for deep NLG adopt pipeline approaches comprising stages such as constructing syntactic input, predicting function words, linearizing the syntactic input and generating the surface forms.
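
As a toy illustration of such a staged pipeline, which this paper's transition-based approach aims to replace (every function and data structure below is a hypothetical placeholder):

```python
# A toy pipeline mirroring the traditional deep-NLG stages:
# syntactic input construction, function-word prediction,
# linearization, and surface realization.

def construct_syntactic_input(lemmas):
    # Stage 1: build a (here trivial) dependency-like structure.
    return {"head": lemmas[0], "dependents": lemmas[1:]}

def predict_function_words(tree):
    # Stage 2: insert function words (a fixed choice in this sketch).
    return {**tree, "function_words": ["the"]}

def linearize(tree):
    # Stage 3: fix a word order over content and function words.
    return tree["function_words"] + [tree["head"]] + tree["dependents"]

def realize(tokens):
    # Stage 4: produce surface forms (identity here; real systems inflect).
    return " ".join(tokens)

print(realize(linearize(predict_function_words(
    construct_syntactic_input(["dog", "barked"])))))
# -> the dog barked
```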

Semantic Noise Matters for Neural Natural Language Generation

tuetschek/e2e-cleaning WS 2019

Neural natural language generation (NNLG) systems are known for their pathological outputs, i.e., generating text which is unrelated to the input specification.

A Hierarchical Model for Data-to-Text Generation

KaijuML/data-to-text-hierarchical 20 Dec 2019

Serializing the input data into a flat sequence, however, loses most of the structure contained in the data.

Revisiting Challenges in Data-to-Text Generation with Fact Grounding

wanghm92/rw_fg WS 2019

Data-to-text generation models face challenges in ensuring data fidelity by referring to the correct input source.

Modeling Global and Local Node Contexts for Text Generation from Knowledge Graphs

UKPLab/kg2text 29 Jan 2020

Recent graph-to-text models generate text from graph-based data using either global or local aggregation to learn node representations.
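
A toy sketch of the local vs. global distinction (the graph, features, and mean-pooling aggregators below are illustrative assumptions, not the paper's architecture):

```python
# Local vs. global node contexts on a tiny hypothetical knowledge graph.
import numpy as np

# Adjacency matrix and node features for a 4-node graph.
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)
X = np.random.randn(4, 8)                    # 4 nodes, 8-dim features

# Local aggregation: each node averages over its graph neighbors only.
deg = A.sum(axis=1, keepdims=True)
local = (A @ X) / np.maximum(deg, 1)

# Global aggregation: each node sees a summary of every node in the graph.
global_ctx = X.mean(axis=0, keepdims=True)   # one vector for the whole graph
combined = np.concatenate([local, np.repeat(global_ctx, 4, axis=0)], axis=1)
print(combined.shape)                        # -> (4, 16)
```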

Variational Template Machine for Data-to-Text Generation

ReneeYe/VariationalTemplateMachine ICLR 2020

We propose the variational template machine (VTM), a novel method to generate text descriptions from data tables.

Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation with Semantic Fidelity

amazon-research/datatuner COLING 2020

Our generated text has significantly better semantic fidelity than the state of the art across all four datasets.

ToTTo: A Controlled Table-To-Text Generation Dataset

google-research-datasets/ToTTo EMNLP 2020

We present ToTTo, an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description.
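
One common way to feed such a task to a seq2seq model is to linearize the page context and highlighted cells into a single source string; the tag vocabulary and table below are hypothetical simplifications, not ToTTo's exact preprocessing:

```python
# Sketch of a ToTTo-style input: serialize the highlighted cells
# (plus page context) into one source string for a seq2seq model.

table = [
    ["Year", "Title", "Role"],       # header row
    ["1997", "Film A", "Lead"],
    ["2002", "Film B", "Supporting"],
]
highlighted = [(2, 1), (2, 2)]       # (row, col) of highlighted cells

def linearize(page_title, table, highlighted):
    header = table[0]
    parts = [f"<page> {page_title}"]
    for r, c in highlighted:
        parts.append(f"<cell> {table[r][c]} <header> {header[c]}")
    return " ".join(parts)

print(linearize("Example Actor", table, highlighted))
# -> <page> Example Actor <cell> Film B <header> Title <cell> Supporting <header> Role
```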

GPT-too: A language-model-first approach for AMR-to-text generation

IBM/GPT-too-AMR2text ACL 2020

Abstract Meaning Representations (AMRs) are broad-coverage sentence-level semantic graphs.
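
For context, AMRs are usually written in PENMAN notation; the sketch below shows the standard "The boy wants to go" example and one naive way to flatten it into a string a pretrained language model can condition on (the whitespace-collapsing linearization is an illustrative assumption, not the paper's method):

```python
# Serialize an AMR graph (PENMAN notation) into a flat string that a
# pretrained LM can take as conditioning context.

amr = """
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
            :ARG0 b))
"""

def linearize_amr(penman: str) -> str:
    # Collapse whitespace so the graph becomes a single token sequence.
    return " ".join(penman.split())

print(linearize_amr(amr))
# -> (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-02 :ARG0 b))
```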

Partially-Aligned Data-to-Text Generation with Distant Supervision

fuzihaofzh/distant_supervision_nlg EMNLP 2020

Such partially-aligned data is much easier to obtain, since it can be produced automatically.