A Step-by-Step Gradient Penalty with Similarity Calculation for Text Summary Generation

Neural Processing Letters 2022 · Shuai Zhao

A summary generation model equipped with a gradient penalty avoids overfitting and is more stable during training. However, the traditional gradient penalty faces two issues: (i) computing the gradient twice increases training time, and (ii) the disturbance factor requires repeated trials to find its best value. To this end, we propose a step-by-step gradient penalty model with similarity calculation (S2SGP). First, the step-by-step gradient penalty is applied to the summary generation model, effectively reducing training time without sacrificing accuracy. Second, the similarity score between the reference and candidate summaries is used as the disturbance factor. To demonstrate the performance of the proposed solution, we conduct experiments on four summary generation datasets, one of which, EDUSum, is newly produced by us. Experimental results show that S2SGP effectively reduces training time and that its disturbance factors do not rely on repeated trials. In particular, our model outperforms the baseline by more than 2.4 ROUGE-L points on the CSL dataset.
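The abstract describes the mechanism only at a high level, so the following is a minimal sketch of the idea, not the authors' implementation. It assumes the gradient penalty follows the common FGM-style embedding perturbation (one gradient pass on the clean input, a second on the perturbed one, which matches the "calculating the gradient twice" cost the abstract mentions), and that the disturbance factor epsilon is set from a ROUGE-L similarity between the reference and a decoded candidate summary. `model`, `embedding_layer`, and the Hugging-Face-style `model(**batch).loss` interface are illustrative assumptions.

import torch

def rouge_l_f1(candidate_tokens, reference_tokens):
    """ROUGE-L F1 via longest common subsequence over token lists."""
    m, n = len(candidate_tokens), len(reference_tokens)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = (dp[i][j] + 1
                                if candidate_tokens[i] == reference_tokens[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    lcs = dp[m][n]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / m, lcs / n
    return 2 * precision * recall / (precision + recall)

def train_step_with_gradient_penalty(model, embedding_layer, batch,
                                     optimizer, epsilon):
    """One training step with a similarity-scaled gradient penalty.

    epsilon is the disturbance factor; here it would come from
    rouge_l_f1(candidate, reference) rather than manual tuning.
    """
    # 1) ordinary forward/backward pass on the clean input
    loss = model(**batch).loss
    loss.backward()

    # 2) perturb the embedding weights along the gradient direction,
    #    scaled by the similarity-derived disturbance factor
    grad = embedding_layer.weight.grad
    norm = torch.norm(grad)
    if norm != 0:
        delta = epsilon * grad / norm
        embedding_layer.weight.data.add_(delta)

        # 3) second forward/backward on the perturbed embeddings; this
        #    is the extra gradient computation whose cost the paper's
        #    step-by-step variant reduces by applying it only at
        #    selected steps (the exact schedule is not in the abstract)
        adv_loss = model(**batch).loss
        adv_loss.backward()

        # restore the original embeddings before the optimizer update
        embedding_layer.weight.data.sub_(delta)

    optimizer.step()
    optimizer.zero_grad()

In use, epsilon could be refreshed periodically by greedily decoding a candidate summary and scoring it against the reference with rouge_l_f1, removing the repeated-trial search for a fixed disturbance factor; how often the paper refreshes it is not stated in this abstract.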


Datasets

EDUSum (introduced in this paper)

Results from the Paper


Task: Abstractive Text Summarization
Dataset: EDUSum

Model         ROUGE-1   ROUGE-2   ROUGE-L   Global Rank
Seq2seq       48.62     32.32     44.13     #1
BERT          62.37     50.70     59.40     #2
RoBERTa       63.22     51.34     60.26     #3
NEZHA         63.91     51.88     61.00     #4
GP_Step_Sim   64.48     52.70     61.91     #5
