Text Summarization
369 papers with code • 33 benchmarks • 88 datasets
Text Summarization is a natural language processing (NLP) task that condenses a lengthy text document into a shorter version while retaining the most important information and meaning. The goal is to produce a summary that accurately represents the content of the original text in a concise form.
There are different approaches to text summarization, including extractive methods that identify and extract important sentences or phrases from the text, and abstractive methods that generate new text based on the content of the original text.
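The extractive approach can be sketched with a simple, classic heuristic: score each sentence by how frequent its words are in the whole document, then keep the top-scoring sentences. This is a minimal illustration only; the function name and scoring rule below are illustrative and not taken from any of the papers listed on this page.

```python
import re
from collections import Counter

def extractive_summary(text, num_sentences=2):
    """Frequency-based extractive summarization sketch:
    rank sentences by the average document frequency of their
    words and return the top ones in their original order."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'\w+', text.lower()))

    def score(sentence):
        # Average word frequency, so long sentences are not favored.
        tokens = re.findall(r'\w+', sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    top = set(sorted(sentences, key=score, reverse=True)[:num_sentences])
    # Re-emit selected sentences in document order for readability.
    return ' '.join(s for s in sentences if s in top)
```

Abstractive methods, by contrast, require a generative model (typically a pretrained sequence-to-sequence network, as in several of the papers below) rather than a scoring heuristic like this one.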
Libraries
Use these libraries to find Text Summarization models and implementations.
Most implemented papers
Evaluating the Factual Consistency of Abstractive Text Summarization
Currently used metrics for assessing summarization algorithms do not account for whether summaries are factually consistent with source documents.
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training
This paper presents a new sequence-to-sequence pre-training model called ProphetNet, which introduces a novel self-supervised objective named future n-gram prediction and the proposed n-stream self-attention mechanism.
ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation
Current pre-training works in natural language generation pay little attention to the problem of exposure bias on downstream tasks.
BARThez: a Skilled Pretrained French Sequence-to-Sequence Model
We show BARThez to be very competitive with state-of-the-art BERT-based French language models such as CamemBERT and FlauBERT.
PanGu-$\alpha$: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
To enhance the generalization ability of PanGu-$\alpha$, we collect 1.1TB of high-quality Chinese data from a wide range of domains to pretrain the model.
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
In this work, we pursue a unified paradigm for multimodal pretraining to break the scaffolds of complex task/modality-specific customization.
LCSTS: A Large Scale Chinese Short Text Summarization Dataset
Automatic text summarization is widely regarded as a highly difficult problem, partly because of the lack of large text summarization datasets.
A Regularized Framework for Sparse and Structured Neural Attention
Modern neural networks are often augmented with an attention mechanism, which tells the network where to focus within the input.
Data-driven Summarization of Scientific Articles
Data-driven approaches to sequence-to-sequence modelling have been successfully applied to short text summarization of news articles.
Deep Reinforcement Learning For Sequence to Sequence Models
In this survey, we consider seq2seq problems from the RL point of view and provide a formulation combining the power of RL methods in decision-making with sequence-to-sequence models that enable remembering long-term memories.