Data-to-Text Generation

105 papers with code • 24 benchmarks • 22 datasets

A classic problem in natural-language generation (NLG) involves taking structured data, such as a table, as input and producing text that adequately and fluently describes this data as output. Unlike machine translation, which aims for a complete transduction of the sentence to be translated, this form of NLG is usually taken to require addressing (at least) two separate challenges: what to say, the selection of an appropriate subset of the input data to discuss, and how to say it, the surface realization of the selected content.
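The two stages above can be sketched in a few lines. This is a minimal illustrative pipeline, not any particular system's method: the record fields (`entity`, `value`, `unit`) and the salience threshold are hypothetical assumptions, and the surface realizer is a simple template rather than a learned model.

```python
# Minimal sketch of the two data-to-text stages:
# content selection ("what to say") followed by
# template-based surface realization ("how to say it").
# Record fields and the salience rule are illustrative assumptions.

def select_content(records, min_value=10):
    """What to say: keep only records deemed salient enough."""
    return [r for r in records if r["value"] >= min_value]

def realize(records):
    """How to say it: render each selected record with a template."""
    clauses = [f"{r['entity']} scored {r['value']} {r['unit']}"
               for r in records]
    return " and ".join(clauses) + "." if clauses else ""

# A toy box-score table (hypothetical data).
table = [
    {"entity": "LeBron James", "value": 32, "unit": "points"},
    {"entity": "LeBron James", "value": 4, "unit": "turnovers"},  # filtered out
    {"entity": "Kevin Love", "value": 18, "unit": "points"},
]

text = realize(select_content(table))
print(text)  # LeBron James scored 32 points and Kevin Love scored 18 points.
```

In neural systems both stages are typically learned jointly rather than hand-written, but the decomposition remains the same.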

(Image credit: Data-to-Text Generation with Content Selection and Planning)

Libraries

Use these libraries to find Data-to-Text Generation models and implementations

Most implemented papers

CoNT: Contrastive Neural Text Generation

google-research-datasets/ToTTo 29 May 2022

We validate CoNT on five generation tasks with ten benchmarks, including machine translation, summarization, code comment generation, data-to-text generation and commonsense generation.

Triples-to-isiXhosa (T2X): Addressing the Challenges of Low-Resource Agglutinative Data-to-Text Generation

francois-meyer/t2x 12 Mar 2024

In this paper we tackle data-to-text for isiXhosa, which is low-resource and agglutinative.

What to talk about and how? Selective Generation using LSTMs with Coarse-to-Fine Alignment

HMEIatJHU/SelectiveGeneration NAACL 2016

We propose an end-to-end, domain-independent neural encoder-aligner-decoder model for selective generation, i.e., the joint task of content selection and surface realization.

The Code2Text Challenge: Text Generation in Source Code Libraries

yakazimir/Code-Datasets 31 Jul 2017

We propose a new shared task for tactical data-to-text generation in the domain of source code libraries.

A surprisingly effective out-of-the-box char2char model on the E2E NLG Challenge dataset

shubhamagarwal92/sigdialSubmission WS 2017

We train a char2char model on the E2E NLG Challenge data, by exploiting "out-of-the-box" the recently released tfseq2seq framework, using some of the standard options offered by this tool.

Bootstrapping Generators from Noisy Data

EdinburghNLP/wikigen NAACL 2018

A core step in statistical data-to-text generation concerns learning correspondences between structured data representations (e.g., facts in a database) and associated texts.

Describing a Knowledge Base

EagleW/Describing_a_Knowledge_Base WS 2018

We aim to automatically generate natural language descriptions about an input structured knowledge base (KB).

Operations Guided Neural Networks for High Fidelity Data-To-Text Generation

janenie/espn-nba-data 8 Sep 2018

Even though the generated texts are mostly fluent and informative, neural models often produce descriptions that are not consistent with the input structured data.

Findings of the E2E NLG Challenge

UFAL-DSG/tgen WS 2018

This paper summarises the experimental setup and results of the first shared task on end-to-end (E2E) natural language generation (NLG) in spoken dialogue systems.

End-to-End Content and Plan Selection for Data-to-Text Generation

sebastianGehrmann/diverse_ensembling WS 2018

Learning to generate fluent natural language from structured data with neural networks has become a common approach for NLG.