4 dataset results for Zero-shot Named Entity Recognition (NER)

WikiEvents

WikiEvents is a document-level event extraction benchmark dataset that includes complete event and coreference annotations.

27 PAPERS • 2 BENCHMARKS

CrossNER

CrossNER is a cross-domain Named Entity Recognition (NER) dataset: a fully labeled collection of NER data spanning five diverse domains (Politics, Natural Science, Music, Literature, and Artificial Intelligence), with specialized entity categories for each domain. CrossNER also includes unlabeled domain-related corpora for the same five domains.

12 PAPERS • 1 BENCHMARK

Broad Twitter Corpus

This paper introduces the Broad Twitter Corpus (BTC), which is not only significantly bigger, but also sampled across different regions, temporal periods, and types of Twitter users. The gold-standard named entity annotations are made by a combination of NLP experts and crowd workers, which makes it possible to harness crowd recall while maintaining high quality. The authors also measure the entity drift observed in the dataset (i.e., how entity representation varies over time) and compare it with newswire.

11 PAPERS • 2 BENCHMARKS

HarveyNER

Fine-grained location name extraction from disaster-related tweets.

3 PAPERS • 1 BENCHMARK
