TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Data-to-Text Generation	DART	self-mem + new data	BLEU	47.76	# 1
Data-to-Text Generation	E2E	self-mem + new data (random)	METEOR	46.11	# 1
Data-to-Text Generation	E2E	self-mem + new data (fixed)	METEOR	46.07	# 2
Data-to-Text Generation	E2E NLG Challenge	Self-memory	BLEU	65.11	# 9
Data-to-Text Generation	E2E NLG Challenge	Self-memory	NIST	8.35	# 7
Data-to-Text Generation	E2E NLG Challenge	Self-memory	METEOR	46.11	# 1
Data-to-Text Generation	E2E NLG Challenge	Self-memory	ROUGE-L	68.41	# 6
Data-to-Text Generation	E2E NLG Challenge	Self-memory	CIDEr	2.16	# 7

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/self-training-from-self-memory-in-data-to/data-to-text-generation-on-dart)](https://paperswithcode.com/sota/data-to-text-generation-on-dart?p=self-training-from-self-memory-in-data-to)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/self-training-from-self-memory-in-data-to/data-to-text-generation-on-e2e)](https://paperswithcode.com/sota/data-to-text-generation-on-e2e?p=self-training-from-self-memory-in-data-to)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/self-training-from-self-memory-in-data-to/data-to-text-generation-on-e2e-nlg-challenge)](https://paperswithcode.com/sota/data-to-text-generation-on-e2e-nlg-challenge?p=self-training-from-self-memory-in-data-to)`

Self-training from Self-memory in Data-to-text Generation

19 Jan 2024 · Hoang-Thang Ta

This paper introduces a novel training model, self-training from self-memory (STSM), for data-to-text generation (DTG). STSM lets the model self-train on subsets that include self-memory, i.e., outputs inferred directly from the trained models, and/or new data. The quality of self-memory is validated by two models, data-to-text (D2T) and text-to-data (T2D), under two pre-defined conditions: (1) all source values appear in the outputs of the D2T model, and (2) the T2D model can convert those outputs back to the source data. We use a greedy algorithm to generate shorter D2T outputs, provided they contain all source values. We then use the T2D model to confirm that these outputs capture the input relationships by converting the text back into data. With 30% of the dataset, the D2T model reaches performance competitive with full training in the same setup. We evaluate our model on two datasets, E2E NLG and DART. STSM gives the D2T model a generalization capability from its subset memory while reducing the volume of training data. Ultimately, we anticipate that this paper will contribute to continual learning solutions that adapt to new training data, incorporating it as a form of self-memory in DTG tasks. The curated dataset is publicly available at: https://github.com/hoangthangta/STSM.
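The two validation conditions in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary representation of source data, the case-insensitive substring check, and the `toy_t2d` stand-in for a trained T2D model are all assumptions made for illustration. The sketch also selects among already-generated candidates, shortest first, rather than generating shorter outputs itself.

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical representation of one source record (an assumption).
Source = Dict[str, str]


def contains_all_values(text: str, source: Source) -> bool:
    """Condition (1): every source value appears in the generated text."""
    lowered = text.lower()
    return all(value.lower() in lowered for value in source.values())


def select_self_memory(
    source: Source,
    candidates: Iterable[str],
    text_to_data: Callable[[str], Source],
) -> Optional[str]:
    """Return the shortest candidate that passes both conditions, or None."""
    # Greedy step: keep candidates that cover all values, shortest first.
    covering = sorted(
        (c for c in candidates if contains_all_values(c, source)),
        key=len,
    )
    for text in covering:
        # Condition (2): the T2D model must recover the original source data.
        if text_to_data(text) == source:
            return text
    return None


# Toy usage with made-up data and a fake T2D model.
source = {"name": "Aromi", "food": "Chinese"}
candidates = [
    "Aromi is a restaurant that serves Chinese food in the city centre.",
    "Aromi serves Chinese food.",
    "Chinese Aromi.",
    "Aromi is a coffee shop.",
]


def toy_t2d(text: str) -> Source:
    # Stand-in for a trained T2D model: it only "parses" sentences
    # that state the relation explicitly with the word "serves".
    return dict(source) if "serves" in text else {}


chosen = select_self_memory(source, candidates, toy_t2d)
print(chosen)
```

In this toy run, "Chinese Aromi." contains every source value but fails the round trip, so the next-shortest covering candidate is kept instead. This is the case condition (2) is meant to catch: text that mentions the values without expressing their relationships.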


Code


hoangthangta/stsm official

Tasks


Continual Learning

Data-to-Text Generation

Text Generation

Datasets

E2E

DART

Results from the Paper


Ranked #1 on Data-to-Text Generation on E2E



Methods


self-mem + new data
