Data-to-Text Generation

105 papers with code • 24 benchmarks • 22 datasets

A classic problem in natural-language generation (NLG) involves taking structured data, such as a table, as input and producing text that adequately and fluently describes this data as output. Unlike machine translation, which aims for complete transduction of the sentence to be translated, this form of NLG is usually taken to require addressing (at least) two separate challenges: what to say (the selection of an appropriate subset of the input data to discuss) and how to say it (the surface realization of the selected content).
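To make the two challenges concrete, here is a minimal Python sketch over a hypothetical box-score record; the field names and the template are illustrative assumptions, not drawn from any particular dataset or system.

```python
# Minimal sketch of the two sub-problems, using a hypothetical box-score record.
record = {
    "team": "Hawks", "opponent": "Lions",
    "team_points": 102, "opponent_points": 98, "attendance": 17350,
}

# "What to say": content selection -- keep only the fields worth reporting.
selected = {k: record[k] for k in ("team", "opponent", "team_points", "opponent_points")}

# "How to say it": surface realization -- turn the selected content into fluent text.
text = "The {team} beat the {opponent} {team_points}-{opponent_points}.".format(**selected)
print(text)  # The Hawks beat the Lions 102-98.
```

Neural systems typically learn both steps jointly, but the same decomposition underlies most of the work listed below.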

(Image credit: Data-to-Text Generation with Content Selection and Planning)

Libraries

Use these libraries to find Data-to-Text Generation models and implementations

Latest papers with no code

Decoder-Only or Encoder-Decoder? Interpreting Language Model as a Regularized Encoder-Decoder

no code yet • 8 Apr 2023

Grounded in our analysis, we propose a novel partial attention language model to solve the attention degeneration problem.

MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation

no code yet • 16 Dec 2022

We propose MURMUR, a neuro-symbolic modular approach to text generation from semi-structured data with multi-step reasoning.

Grounded Keys-to-Text Generation: Towards Factual Open-Ended Generation

no code yet • 4 Dec 2022

We propose a new grounded keys-to-text generation task: generate a factual description of an entity given a set of guiding keys and grounding passages.
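For illustration only, a hypothetical instance of the described task is sketched below; the entity, keys, passages, and target text are invented, and the paper's actual data format may differ.

```python
# Invented example of a grounded keys-to-text instance (not taken from the paper's dataset).
example = {
    "entity": "Ada Lovelace",
    "guiding_keys": ["occupation", "known_for", "collaborator"],
    "grounding_passages": [
        "Ada Lovelace worked with Charles Babbage on the Analytical Engine.",
        "She is often described as the first computer programmer.",
    ],
    "target_text": (
        "Ada Lovelace was a mathematician known as the first computer programmer "
        "for her work with Charles Babbage on the Analytical Engine."
    ),
}
```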

Time-aware Prompting for Text Generation

no code yet • 3 Nov 2022

Despite showing a smaller performance drop when tested on data drawn from a later time, linear prompts focus more on non-temporal information and are less sensitive to the given timestamps, according to human evaluations and sensitivity analyses.

Mapping Process for the Task: Wikidata Statements to Text as Wikipedia Sentences

no code yet • 23 Oct 2022

Widely acknowledged as one of the most successful online collaborative projects in human society, Wikipedia has grown rapidly in recent years and continually seeks to expand its content and disseminate knowledge to everyone around the world.

Table-To-Text generation and pre-training with TabT5

no code yet • 17 Oct 2022

Encoder-only transformer models have been successfully applied to different table understanding tasks, as in TAPAS (Herzig et al., 2020).

Comparing Computational Architectures for Automated Journalism

no code yet • 8 Oct 2022

The majority of NLG systems have been designed following either a template-based or a pipeline-based architecture.
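As a rough illustration of the contrast, the sketch below implements both styles over a hypothetical weather record; the stage names, fields, and templates are assumptions for illustration, not the systems compared in the paper.

```python
record = {"city": "Oslo", "temp_c": -3, "condition": "light snow", "humidity": 81}

# Template-based: one fixed surface form, slots filled directly from the record.
def template_nlg(rec):
    return "In {city}, expect {condition} with temperatures around {temp_c} °C.".format(**rec)

# Pipeline-based: explicit stages (content selection -> sentence planning -> realization).
def pipeline_nlg(rec):
    content = {k: rec[k] for k in ("city", "condition", "temp_c")}   # content selection
    plan = ["report_condition", "report_temperature"]                # sentence planning
    realizers = {                                                    # surface realization
        "report_condition": lambda c: f"{c['condition'].capitalize()} is expected in {c['city']}.",
        "report_temperature": lambda c: f"Temperatures will be around {c['temp_c']} °C.",
    }
    return " ".join(realizers[step](content) for step in plan)

print(template_nlg(record))  # In Oslo, expect light snow with temperatures around -3 °C.
print(pipeline_nlg(record))  # Light snow is expected in Oslo. Temperatures will be around -3 °C.
```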

Calibrating Sequence Likelihood Improves Conditional Language Generation

no code yet • 30 Sep 2022

Conditional language models are predominantly trained with maximum likelihood estimation (MLE), giving probability mass to sparsely observed target sequences.
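For reference, the standard MLE objective for a conditional language model is the token-level negative log-likelihood of the target sequence given the source. A minimal PyTorch sketch, with made-up shapes and an assumed padding id, looks like this:

```python
import torch
import torch.nn.functional as F

vocab_size, pad_id = 100, 0                # assumed vocabulary size and padding id
logits = torch.randn(2, 5, vocab_size)     # model scores per target position (batch=2, len=5)
targets = torch.tensor([[7, 3, 9, 0, 0],   # padded gold target token ids
                        [4, 8, 2, 6, 1]])

# MLE training loss: average negative log-likelihood of the observed target tokens.
nll = F.cross_entropy(
    logits.reshape(-1, vocab_size),        # (batch * len, vocab)
    targets.reshape(-1),                   # (batch * len,)
    ignore_index=pad_id,                   # padding positions are excluded from the loss
)
print(nll)
```

Calibration methods such as the one proposed here adjust this training signal so that model likelihoods better reflect output quality, rather than only the sparsely observed gold sequences.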

XF2T: Cross-lingual Fact-to-Text Generation for Low-Resource Languages

no code yet • 22 Sep 2022

Our extensive experiments show that a multilingual mT5 model which uses fact-aware embeddings with structure-aware input encoding leads to the best results on average across the twelve languages.

High Recall Data-to-text Generation with Progressive Edit

no code yet • 9 Aug 2022

We observed that when the same target sentence was repeated twice, a Transformer (T5)-based model generated an output made up of asymmetric sentences from the structured inputs.