Extending Multi-Text Sentence Fusion Resources via Pyramid Annotations

ACL ARR January 2022  ·  Anonymous ·

NLP models that process multiple texts often struggle in recognizing corresponding and salient information that is often differently phrased, and consolidating the redundancies across texts. To facilitate research of such challenges, the sentence fusion task was proposed, yet previous datasets for this task were very limited in their size and scope. In this paper, we revisit and substantially extend previous dataset creation efforts. With careful modifications, relabeling, and employing complementing data sources, we were able to more than triple the size of a notable earlier dataset.Moreover, we show that our extended version uses more representative texts for multi-document tasks and provides a more diverse training set, which substantially improves model performance.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here