Hansel is a human-annotated Chinese entity linking (EL) dataset, focusing on tail entities and emerging entities:

  • The test set has 10K examples split into few-shot (FS) and zero-shot (ZS) slices and uses Wikidata as its knowledge base. It is useful for testing how well Chinese and multilingual EL systems generalize to tail and emerging entities.

  • The training and validation sets are derived from Wikipedia hyperlinks and are useful for large-scale pretraining of Chinese EL systems.
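
Because the test set is divided into FS and ZS slices, results are most informative when reported per slice rather than as one pooled number. The following is a minimal sketch of that bookkeeping, not the dataset's official evaluation script. The field names `slice` and `gold_qid`, the helper `accuracy_by_slice`, and the toy records are all hypothetical and do not reflect the actual Hansel file format or contents.

```python
from collections import defaultdict


def accuracy_by_slice(examples, predictions):
    """Return {slice_name: accuracy} for predicted Wikidata QIDs.

    `examples` is a list of dicts with hypothetical keys "slice" and
    "gold_qid"; `predictions` is a parallel list of predicted QIDs.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for example, predicted_qid in zip(examples, predictions):
        slice_name = example["slice"]
        total[slice_name] += 1
        if predicted_qid == example["gold_qid"]:
            correct[slice_name] += 1
    return {name: correct[name] / total[name] for name in total}


# Toy records for illustration only; these are NOT real Hansel examples.
examples = [
    {"slice": "FS", "gold_qid": "Q1"},
    {"slice": "FS", "gold_qid": "Q2"},
    {"slice": "ZS", "gold_qid": "Q3"},
    {"slice": "ZS", "gold_qid": "Q4"},
]
predictions = ["Q1", "Q9", "Q3", "Q4"]

scores = accuracy_by_slice(examples, predictions)
print(scores)
```

Keeping the two slices separate makes it visible when a system handles rarely seen entities but fails on entities it has never seen, or the reverse.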
