Neural Abstractive Unsupervised Summarization of Online News Discussions

7 Jun 2021  ·  Ignacio Tampe Palma, Marcelo Mendoza, Evangelos Milios

Summarization has usually relied on gold standard summaries to train extractive or abstractive models. Social media poses a hurdle for summarization techniques, since it requires a multi-document, multi-author approach. We address this challenging task by introducing a novel method that generates abstractive summaries of online news discussions. Our method extends a BERT-based architecture with an attention encoding that is fed the comments' likes during the training stage. To train our model, we define a task that consists of reconstructing high-impact comments, where impact is measured by popularity (likes). Accordingly, our model learns to summarize online discussions based on their most relevant comments. Our novel approach provides a summary that represents the most relevant aspects of a news item that users comment on, incorporating the social context as a source of information to summarize texts in online social networks. Our model is evaluated using ROUGE scores between the generated summary and each comment on the thread. Under this evaluation, our model, including the social attention encoding, significantly outperforms both extractive and abstractive summarization methods.
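The evaluation described in the abstract, scoring a generated summary against each comment on the thread, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes whitespace tokenization, ROUGE-1 F1 only, and a plain average over comments. The function names (`rouge1_f1`, `mean_rouge1_over_thread`, `top_comments_by_likes`) and the `(text, likes)` tuple layout are hypothetical, chosen only for illustration.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 (ROUGE-1) between two strings.

    Assumption: lowercase whitespace tokenization, no stemming.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each token counts at most min(cand, ref) times.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def mean_rouge1_over_thread(summary: str, comments: list[str]) -> float:
    """Average ROUGE-1 F1 of one summary against every comment in a thread."""
    if not comments:
        return 0.0
    return sum(rouge1_f1(summary, c) for c in comments) / len(comments)


def top_comments_by_likes(comments: list[tuple[str, int]], k: int) -> list[str]:
    """Return the text of the k most-liked comments (hypothetical helper).

    Assumption: each comment is a (text, likes) pair; such comments would
    play the role of the high-impact reconstruction targets.
    """
    ranked = sorted(comments, key=lambda pair: pair[1], reverse=True)
    return [text for text, _likes in ranked[:k]]
```

A real evaluation would more likely use an established ROUGE implementation with stemming and the ROUGE-2 and ROUGE-L variants; the sketch only shows the shape of the per-comment comparison.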


Datasets


Introduced in the Paper:

Emol news articles and comments
Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Unsupervised Text Summarization | Emol news articles dataset | Title-Single comment prediction | XENT(Rogue) | 0.358 | # 1
Unsupervised Text Summarization | Emol news articles dataset | Title-Triple comment prediction with attention | XENT(Rogue) | 0.153 | # 2

Methods