Factorizing Content and Budget Decisions in Abstractive Summarization of Long Documents

25 May 2022  ·  Marcio Fonseca, Yftah Ziser, Shay B. Cohen

We argue that disentangling content selection from the budget used to cover salient content improves the performance and applicability of abstractive summarizers. Our method, FactorSum, achieves this disentanglement by factorizing summarization into two steps through an energy function: (1) generating abstractive summary views; (2) combining these views into a final summary, guided by a budget and by content guidance. This guidance may come from different sources, including an advisor model such as BART or BigBird, or -- in oracle mode -- from the reference summary. This factorization achieves significantly higher ROUGE scores on multiple benchmarks for long document summarization, namely PubMed, arXiv, and GovReport. Most notably, our model is effective for domain adaptation: when trained only on PubMed samples, it achieves a 46.29 ROUGE-1 score on arXiv, indicating strong performance due to more flexible budget adaptation and content selection that is less dependent on domain-specific textual structure.
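The second step described above, combining summary views into a final summary under a budget and content guidance, can be sketched as a greedy selection procedure. This is an illustrative approximation only: the function names and the token-overlap score standing in for the paper's energy function are assumptions, not the released FactorSum implementation.

```python
# Hypothetical sketch of FactorSum's view-combination step: greedily add
# the abstractive summary view that most improves agreement with the
# content guidance (e.g., an advisor model's summary), subject to a word
# budget. The overlap score below is a stand-in for the paper's energy
# function; all names here are illustrative.

def score(candidate_words, guidance_words):
    """Token-overlap proxy for content guidance agreement (illustrative)."""
    if not candidate_words:
        return 0.0
    unique = set(candidate_words)
    return len(unique & set(guidance_words)) / len(unique)

def combine_views(views, guidance, budget):
    """Greedily pick the view with the largest score gain; stop when the
    word budget is exhausted or no remaining view improves the score."""
    guidance_words = guidance.lower().split()
    summary, summary_words = [], []
    remaining = list(views)
    while remaining and len(summary_words) < budget:
        current = score(summary_words, guidance_words)
        best, best_gain = None, 0.0
        for view in remaining:
            candidate = summary_words + view.lower().split()
            if len(candidate) > budget:
                continue  # adding this view would exceed the budget
            gain = score(candidate, guidance_words) - current
            if gain > best_gain:
                best, best_gain = view, gain
        if best is None:
            break  # no view improves guidance agreement within budget
        summary.append(best)
        summary_words += best.lower().split()
        remaining.remove(best)
    return " ".join(summary)
```

Because the budget is an explicit argument rather than a property learned from one domain's training data, the same procedure can target a different summary length at inference time, which is the flexibility the abstract credits for cross-domain gains.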


Results from the Paper


Task                Dataset                      Model      Metric   Value  Global Rank
Text Summarization  Arxiv HEP-TH citation graph  FactorSum  ROUGE-1  49.32  #5
Text Summarization  Arxiv HEP-TH citation graph  FactorSum  ROUGE-2  20.27  #9
Text Summarization  Arxiv HEP-TH citation graph  FactorSum  ROUGE-L  44.76  #2
Text Summarization  GovReport                    FactorSum  ROUGE-1  60.1   #1
Text Summarization  GovReport                    FactorSum  ROUGE-2  25.28  #1
Text Summarization  GovReport                    FactorSum  ROUGE-L  56.65  #1
Text Summarization  PubMed                       FactorSum  ROUGE-1  47.5   #11
Text Summarization  PubMed                       FactorSum  ROUGE-2  20.33  #14
Text Summarization  PubMed                       FactorSum  ROUGE-L  43.76  #7
