PSC (Polish Summaries Corpus)

Introduced by Ogrodniczuk et al. in The Polish Summaries Corpus

The Polish Summaries Corpus (PSC) is a resource created to support the development and evaluation of tools for automated single-document summarization of Polish. It contains a large number of manual summaries of news articles, with many independently created summaries for each text. This design is meant to counter annotator bias, a problem often reported when summarization algorithms are evaluated against a single gold standard.
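To illustrate why several independent references help, the sketch below scores one system summary against multiple human summaries and averages the result, so that no single annotator's wording decides the score. It is a simplified unigram-recall measure written for illustration, not the evaluation protocol used by the corpus authors; the function names and the toy Polish strings are assumptions.

```python
from collections import Counter


def unigram_recall(candidate: str, reference: str) -> float:
    """Share of reference tokens (with multiplicity) that also occur in the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[token]) for token, count in ref.items())
    return overlap / sum(ref.values())


def multi_reference_score(candidate: str, references: list[str]) -> float:
    """Average the per-reference recall so no single annotator dominates."""
    if not references:
        raise ValueError("at least one reference summary is required")
    return sum(unigram_recall(candidate, r) for r in references) / len(references)


# Toy data invented for this example (not taken from the corpus).
system_summary = "rząd ogłosił nowy budżet"
human_summaries = ["rząd ogłosił budżet", "nowy budżet państwa"]
score = multi_reference_score(system_summary, human_summaries)
```

With a single reference, the score would be either 1.0 or about 0.67 depending on which annotator happened to be chosen; averaging over both gives a value between the two.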

The corpus includes both abstractive, free-form summaries and extractive summaries created by selecting text spans from the original document. It can be used not only to evaluate existing summarization tools but also to study how humans summarize texts in Polish.
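As a rough illustration of span-based extractive summaries, the sketch below represents each summary as a list of character offsets into the source text and measures how much two annotators' selections overlap. The `(start, end)` representation, the function names, and the sample text are assumptions made for this example; the actual corpus format may differ.

```python
Span = tuple[int, int]  # hypothetical (start, end) character offsets, end exclusive


def apply_spans(text: str, spans: list[Span]) -> str:
    """Build the extractive summary by joining the selected spans in document order."""
    return " ".join(text[start:end] for start, end in sorted(spans))


def span_agreement(a: list[Span], b: list[Span]) -> float:
    """Jaccard overlap of the character positions selected by two annotators."""
    chars_a = {i for start, end in a for i in range(start, end)}
    chars_b = {i for start, end in b for i in range(start, end)}
    union = chars_a | chars_b
    if not union:
        return 1.0  # both annotators selected nothing
    return len(chars_a & chars_b) / len(union)


# Toy document and selections invented for this example.
document = "Alpha beta. Gamma delta. Epsilon."
annotator_1 = [(0, 11), (25, 33)]   # first and third sentence
annotator_2 = [(0, 11), (12, 24)]   # first and second sentence

summary_1 = apply_spans(document, annotator_1)
agreement = span_agreement(annotator_1, annotator_2)
```

A character-level overlap like this is only one possible way to compare human selections; sentence-level or token-level comparisons would follow the same pattern.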

The corpus was co-funded by the ATLAS project and by the European Union from the resources of the European Social Fund. The texts to summarize were extracted from a specific source corpus and are currently available under the terms stated on that corpus's webpage. A Java API for the corpus is also available.

License


  • Unknown
