A Dataset for Relation Extraction of Natural-Products (A curated evaluation dataset for end-to-end Relation Extraction of relationships between organisms and natural-products)

Introduced by Delmas et al. in Relation Extraction in underexplored biomedical domains: A diversity-optimised sampling and synthetic data generation approach

A curated evaluation dataset for end-to-end Relation Extraction of relationships between organisms and natural-products.

Details about the manual annotation:

  • For Chemicals:

    • The chemical labels are annotated as they appear in the abstract.

    • Individual chemicals and classes of chemicals produced by a specific organism are distinguished in the abstracts.

    • The "type" attribute {"chemical", "class"} is used to indicate the nature of the mentioned name.

    • A "class" attribute for chemical entities has also been included if class information is present in the abstract.

    • Wikidata and PubChem identifiers were assigned to chemicals and classes when available.

  • For Organisms:

    • The organism labels are annotated as they appear in the abstract.

    • If an abstract first mentions the genus name, e.g. "Plakinastrella sp.", and then gives the precise species name, e.g. "Plakinastrella clathrata", only the species name is used.

    • A Wikidata identifier was assigned to all organisms.

    • In some abstracts, only the genus name is mentioned.

  • For Relations:

    • Only the relations explicitly mentioned in the abstract are reported in the output labels.

    • Relations are reported in their order of appearance in the abstract.
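
The annotation guidelines above can be sketched as a small record schema. This is only an illustration, not the dataset's actual file format: the class names, field names (`chemical_class`, `wikidata_id`, `pubchem_id`) and the helper `prefer_species` are hypothetical, and the identifier and chemical label in the usage example are placeholders rather than real annotations.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chemical:
    label: str                            # surface form, as it appears in the abstract
    type: str                             # "chemical" or "class"
    chemical_class: Optional[str] = None  # class information, if the abstract gives it
    wikidata_id: Optional[str] = None     # assigned only when available
    pubchem_id: Optional[str] = None      # assigned only when available

    def __post_init__(self) -> None:
        # The guidelines allow exactly two values for the "type" attribute.
        if self.type not in {"chemical", "class"}:
            raise ValueError(f"unexpected type: {self.type!r}")


@dataclass
class Organism:
    label: str        # surface form, as it appears in the abstract
    wikidata_id: str  # every organism is assigned a Wikidata identifier


@dataclass
class Relation:
    organism: Organism
    chemical: Chemical


def prefer_species(mentions: List[str]) -> List[str]:
    """Drop genus-only mentions ("Genus sp.") when a species of that genus is also mentioned.

    Assumes each mention is a non-empty organism name; order of appearance is preserved.
    """
    def is_genus_only(name: str) -> bool:
        parts = name.split()
        return len(parts) == 1 or (len(parts) == 2 and parts[1] in {"sp.", "spp."})

    genera_with_species = {m.split()[0] for m in mentions if not is_genus_only(m)}
    return [
        m for m in mentions
        if not (is_genus_only(m) and m.split()[0] in genera_with_species)
    ]


# Usage with placeholder values (not taken from the dataset).
organism_labels = prefer_species(["Plakinastrella sp.", "Plakinastrella clathrata"])
organism = Organism(label=organism_labels[0], wikidata_id="Q_PLACEHOLDER")
compound = Chemical(label="example compound A", type="chemical")
# Relations are kept in a list so their order of appearance is preserved.
relations = [Relation(organism=organism, chemical=compound)]
```

When only a genus name occurs in the abstract, `prefer_species` leaves it unchanged, which matches the case where an abstract mentions the genus alone.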

Tasks