Control Prefixes for Parameter-Efficient Text Generation

15 Oct 2021 · Jordan Clive, Kris Cao, Marek Rei

Prefix-tuning is a powerful lightweight technique for adapting a large pre-trained language model to a downstream application. However, it uses the same dataset-level tuned prompt for all examples in the dataset. We extend this idea and propose a dynamic method, Control Prefixes, which allows for the inclusion of conditional input-dependent information, combining the benefits of prompt tuning and controlled generation. The method incorporates attribute-level learnable representations into different layers of a pre-trained transformer, allowing for the generated text to be guided in a particular direction. We provide a systematic evaluation of the technique and apply it to five datasets from the GEM benchmark for natural language generation (NLG). Although the aim is to develop a parameter-efficient model, we show Control Prefixes can even outperform full fine-tuning methods. We present state-of-the-art results on several data-to-text datasets, including WebNLG.
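
The contrast between a single dataset-level prefix and attribute-level prefixes can be illustrated with a minimal sketch. It is not the authors' implementation: the class name, the shapes, the attribute labels, and the choice to place control vectors before the shared ones are all assumptions made for illustration. In an actual transformer, prefix vectors of this kind would typically be prepended to the attention keys and values at each layer while the pre-trained weights stay frozen.

```python
import random


class ControlPrefixes:
    """Toy prefix store: one shared prefix plus one prefix per attribute value.

    All names and shapes here are hypothetical, chosen only for illustration.
    """

    def __init__(self, num_layers, dim, general_len, control_len, attributes, seed=0):
        rng = random.Random(seed)

        def new_prefix(length):
            # One (length x dim) block of learnable vectors for every layer.
            return [[[rng.gauss(0.0, 0.02) for _ in range(dim)]
                     for _ in range(length)]
                    for _ in range(num_layers)]

        # Dataset-level prefix, shared by every example (plain prefix-tuning).
        self.general = new_prefix(general_len)
        # Attribute-level prefixes, selected per example by its attribute label.
        self.control = {a: new_prefix(control_len) for a in attributes}

    def prefix_for(self, layer, attribute):
        # Assumption: control vectors come first, followed by the shared ones.
        return self.control[attribute][layer] + self.general[layer]

    def num_parameters(self):
        def count(prefix):
            return sum(len(vec) for layer in prefix for vec in layer)
        return count(self.general) + sum(count(p) for p in self.control.values())


# Hypothetical attribute labels; a real dataset would supply its own.
prefixes = ControlPrefixes(num_layers=2, dim=4, general_len=3, control_len=2,
                           attributes=["category_a", "category_b"])
prefix_a = prefixes.prefix_for(layer=0, attribute="category_a")
prefix_b = prefixes.prefix_for(layer=0, attribute="category_b")
```

Two examples with different attribute labels receive different control vectors but share the same dataset-level vectors, which is the input-dependent behaviour the abstract describes. The number of trainable values grows with the number of attribute labels rather than with the size of the underlying model.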

Results from the Paper


Task                     Dataset                    Model                                Metric Name                            Metric Value  Global Rank
-----------------------  -------------------------  -----------------------------------  -------------------------------------  ------------  -----------
Text Simplification      ASSET                      Control Prefixes (BART)              SARI (EASSE>=0.2.1)                    43.58         #3
Text Simplification      ASSET                      Control Prefixes (BART)              FKGL                                   5.97          #1
Text Simplification      ASSET                      Control Prefixes (BART)              QuestEval (Reference-less, BERTScore)  0.64          #1
Data-to-Text Generation  Cleaned E2E NLG Challenge  Control Prefixes (T5-large)          BLEU (Test set)                        44.15         #1
Text Generation          DART                       Control Prefixes (T5-large)          METEOR                                 0.411         #1
Text Simplification      TurkCorpus                 Control Prefixes (BART)              SARI (EASSE>=0.2.1)                    42.32         #3
Text Simplification      TurkCorpus                 Control Prefixes (BART)              FKGL                                   7.74          #2
Text Simplification      TurkCorpus                 Control Prefixes (BART)              QuestEval (Reference-less, BERTScore)  0.66          #1
Data-to-Text Generation  WebNLG                     Control Prefixes (A1, T5-large)      BLEU                                   67.32         #1
Data-to-Text Generation  WebNLG                     Control Prefixes (A1, A2, T5-large)  BLEU                                   67.15         #2
Data-to-Text Generation  WebNLG Full                Control Prefixes (A1, T5-large)      BLEU                                   61.94         #2
Data-to-Text Generation  WebNLG Full                Control Prefixes (A1, A2, T5-large)  BLEU                                   62.27         #1
