Krapivin

Krapivin is a dataset for benchmarking keyphrase extraction and generation techniques on long English scientific documents. It is a high-quality collection of 2,000 scientific papers from the Computer Science domain published by ACM. Each paper's keyphrases were assigned by the authors and verified by the reviewers. Different parts of each paper, such as the title and abstract, are marked separately, so extraction can be restricted to a particular part of an article's text. The content of each paper was converted from PDF to plain text, and fragments of formulae, tables, figures, and LaTeX markup were removed automatically. Link: https://huggingface.co/datasets/midas/krapivin
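Because the papers carry author-assigned keyphrases, a system's predictions can be scored directly against them. The following is a minimal sketch of one common scoring scheme, exact-match F1 over the top-k predictions. The function names `normalize` and `f1_at_k`, the default `k = 5`, and the lowercase and whitespace normalization are assumptions made for illustration; they are not taken from the dataset or from any benchmark listed for it, and published evaluations often add stemming, which is omitted here.

```python
from __future__ import annotations


def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(phrase.lower().split())


def f1_at_k(predicted: list[str], gold: list[str], k: int = 5) -> float:
    """Exact-match F1 between the top-k distinct predictions and the gold keyphrases.

    Hypothetical helper: `predicted` is assumed to be ranked best-first.
    """
    # Keep the first k distinct, non-empty predictions in ranked order.
    top_k: list[str] = []
    for phrase in predicted:
        norm = normalize(phrase)
        if norm and norm not in top_k:
            top_k.append(norm)
        if len(top_k) == k:
            break

    gold_set = {normalize(p) for p in gold if normalize(p)}
    if not top_k or not gold_set:
        return 0.0

    hits = sum(1 for p in top_k if p in gold_set)
    if hits == 0:
        return 0.0

    precision = hits / len(top_k)
    recall = hits / len(gold_set)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Made-up phrases for illustration only; not drawn from the dataset.
    predicted = ["Keyphrase extraction", "neural networks", "keyphrase  extraction", "ACM"]
    gold = ["keyphrase extraction", "scientific papers"]
    print(round(f1_at_k(predicted, gold, k=3), 3))
```

In this toy call the duplicate prediction is dropped, so one of three distinct predictions matches one of two gold phrases, giving precision 1/3 and recall 1/2.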

Benchmarks

Task	Dataset Variant	Best Model
Keyphrase Extraction	Krapivin	PromptRank

Dataset Loaders

No data loaders found.

Tasks

Keyphrase Extraction

Similar Datasets

SemEval-2017 Task-10

Inspec

NUS

License

Unknown
