DRI Corpus (Dr. Inventor Multi-layer Scientific Corpus)

Introduced by Fisas et al. in A Multi-Layered Annotated Corpus of Scientific Papers

The Dr. Inventor Multi-Layer Scientific Corpus (DRI Corpus) includes 40 Computer Graphics papers selected by domain experts. Each paper in the Corpus has been annotated by three annotators with the following layers of annotation, each characterizing a core aspect of scientific publications:

  • Scientific discourse: each sentence has been assigned a specific scientific discourse category (Background, Approach, Challenge, Future Work, etc.).
  • Subjective statements and novelty: each sentence has been characterized with respect to the advantages, disadvantages, and novel aspects it presents.
  • Citation purpose: each citation has been assigned a purpose specifying why the authors of the paper cited that piece of research.
  • Summary relevance of sentences and hand-written summaries: each sentence of the paper has been given an integer score from 1 to 5 indicating how relevant it is for inclusion in a summary of the paper. Sentences rated 5 are the most relevant for summarizing a paper. For each paper, three hand-written summaries (max 250 words) are provided.
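
As a rough illustration of how the layers above could be held in memory, the following Python sketch models one annotated sentence and filters sentences by their summary relevance score. The class name, field names, and helper function are hypothetical and chosen only for this example; they do not reflect the Corpus's actual file format or any official loader.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AnnotatedSentence:
    """One sentence with its annotation layers (hypothetical field names)."""
    text: str
    discourse: str                     # e.g. "Background", "Approach", "Challenge"
    relevance: int                     # summary relevance, integer from 1 to 5
    subjective: Optional[str] = None   # e.g. "advantage", "disadvantage", "novelty"

    def __post_init__(self) -> None:
        # The relevance layer is described as an integer score from 1 to 5.
        if not 1 <= self.relevance <= 5:
            raise ValueError("relevance must be between 1 and 5")


def most_relevant(sentences: List[AnnotatedSentence],
                  min_score: int = 5) -> List[AnnotatedSentence]:
    """Return the sentences whose relevance score is at least min_score."""
    return [s for s in sentences if s.relevance >= min_score]


if __name__ == "__main__":
    # Invented example sentences, not taken from the Corpus.
    paper = [
        AnnotatedSentence("Prior methods rely on manual rigging.", "Background", 2),
        AnnotatedSentence("We propose an automatic rigging method.", "Approach", 5,
                          subjective="novelty"),
    ]
    for sentence in most_relevant(paper):
        print(sentence.discourse, "|", sentence.text)
```

Keeping the default threshold at 5 mirrors the statement that sentences rated 5 are the most relevant; lowering min_score would admit less central sentences as summary candidates.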
