Document Summarization
195 papers with code • 7 benchmarks • 28 datasets
Automatic Document Summarization is the task of rewriting a document into a shorter form while still retaining its important content. The two most popular paradigms are extractive approaches and abstractive approaches. Extractive approaches generate summaries by extracting parts of the original document (usually sentences), while abstractive methods may generate new words or phrases that are not in the original document.
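To make the extractive paradigm concrete, here is a minimal sketch of a frequency-based extractive summarizer. The function name and scoring scheme are illustrative assumptions, not any particular paper's method: it scores each sentence by the average document-wide frequency of its words and keeps the top-scoring sentences in their original order.

```python
import re
from collections import Counter

def extractive_summary(document: str, num_sentences: int = 2) -> str:
    """Toy extractive summarizer (illustrative sketch, not a published method).

    Scores each sentence by the mean document-wide frequency of its words,
    then returns the top-scoring sentences in their original order.
    """
    sentences = re.split(r"(?<=[.!?])\s+", document.strip())
    freq = Counter(re.findall(r"\w+", document.lower()))
    scored = []
    for idx, sent in enumerate(sentences):
        tokens = re.findall(r"\w+", sent.lower())
        if not tokens:
            continue
        score = sum(freq[t] for t in tokens) / len(tokens)
        scored.append((score, idx, sent))
    top = sorted(scored, reverse=True)[:num_sentences]
    top.sort(key=lambda item: item[1])  # restore document order
    return " ".join(sent for _, _, sent in top)
```

An abstractive system, by contrast, would generate the summary token by token with a language model, so its output need not be a subset of the input sentences.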
Libraries
Use these libraries to find Document Summarization models and implementations.
Most implemented papers
Scoring Sentence Singletons and Pairs for Abstractive Summarization
There is thus a crucial gap between sentence selection and fusion to support summarizing by both compressing single sentences and fusing pairs.
AREDSUM: Adaptive Redundancy-Aware Iterative Sentence Ranking for Extractive Document Summarization
Redundancy-aware extractive summarization systems score the redundancy of the sentences to be included in a summary either jointly with their salience information or separately as an additional sentence scoring step.
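The "separate scoring step" approach described above is often realized with greedy, MMR-style selection. The sketch below is a generic illustration of that idea, not AREDSUM's actual scoring: at each step it picks the sentence that best trades off salience against its maximum similarity to sentences already selected (the `salience` and `similarity` inputs are assumed to come from an upstream model).

```python
def mmr_select(salience, similarity, k, lam=0.7):
    """Greedy redundancy-aware sentence selection (generic MMR-style sketch,
    not AREDSUM's exact method).

    salience[i]      -- relevance score of sentence i (from an upstream model)
    similarity[i][j] -- pairwise sentence similarity in [0, 1]
    lam              -- trade-off between salience and redundancy
    Returns the indices of the k selected sentences, in selection order.
    """
    selected = []
    candidates = set(range(len(salience)))
    while candidates and len(selected) < k:
        def score(i):
            # Penalize by the worst overlap with anything already chosen.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * salience[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With this formulation, a highly salient sentence can still lose to a less salient but more novel one once its near-duplicate is already in the summary.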
DebateSum: A large-scale argument mining and summarization dataset
Finally, we present a search engine for this dataset which is utilized extensively by members of the National Speech and Debate Association today.
Centroid-based Text Summarization through Compositionality of Word Embeddings
The textual similarity is a crucial aspect for many extractive text summarization methods.
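One common way to operationalize that similarity with word embeddings is centroid-based scoring: represent the document by the mean of its word vectors and rank sentences by cosine similarity to that centroid. The sketch below is a simplified illustration of this general idea (the function names and the tiny lookup-table embeddings are assumptions for the example, not the paper's implementation).

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 on zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_by_centroid(sentences, embed):
    """Score sentences by cosine similarity between each sentence vector
    (sum of its word embeddings) and the document centroid (mean embedding).

    `embed` maps a word to a vector; any dict of toy vectors works here.
    Words missing from `embed` are simply skipped.
    """
    tokenized = [s.lower().split() for s in sentences]
    all_words = [w for toks in tokenized for w in toks if w in embed]
    dim = len(next(iter(embed.values())))
    centroid = [sum(embed[w][d] for w in all_words) / len(all_words)
                for d in range(dim)]
    scores = []
    for toks in tokenized:
        vecs = [embed[w] for w in toks if w in embed]
        if not vecs:
            scores.append(0.0)
            continue
        sent_vec = [sum(v[d] for v in vecs) for d in range(dim)]
        scores.append(cosine(sent_vec, centroid))
    return scores
```

Sentences whose words cluster near the document's overall topic vector score highest, which is exactly why the quality of the similarity measure matters so much for this family of methods.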
TAP-DLND 1.0 : A Corpus for Document Level Novelty Detection
Detecting novelty of an entire document is an Artificial Intelligence (AI) frontier problem that has widespread NLP applications, such as extractive document summarization, tracking development of news events, predicting impact of scholarly articles, etc.
Extractive Summarization as Text Matching
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
Screenplay Summarization Using Latent Narrative Structure
Most general-purpose extractive summarization models are trained on news articles, which are short and present all important information upfront.
On Faithfulness and Factuality in Abstractive Summarization
It is well known that the standard likelihood training and approximate decoding objectives in neural text generation models lead to less human-like responses for open-ended tasks such as language modeling and story generation.
Leveraging Graph to Improve Abstractive Multi-Document Summarization
Graphs that capture relations between textual units have great benefits for detecting salient information from multiple documents and generating overall coherent summaries.
Pre-training via Paraphrasing
The objective noisily captures aspects of paraphrase, translation, multi-document summarization, and information retrieval, allowing for strong zero-shot performance on several tasks.