Hansel is a human-annotated Chinese entity linking (EL) dataset that focuses on tail entities and emerging entities.
The test set contains few-shot (FS) and zero-shot (ZS) slices, has 10K examples, and uses Wikidata as its knowledge base. It is useful for testing how well Chinese and multilingual EL systems generalize to tail and emerging entities.
The training and validation sets are built from Wikipedia hyperlinks and are useful for large-scale pretraining of Chinese EL systems.
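
Because the test set is split into FS and ZS slices, results are naturally reported per slice. The following is a minimal sketch of that bookkeeping, not the dataset's official evaluation script. The record fields (`slice`, `gold_qid`), the function name `accuracy_by_slice`, and the toy records are assumptions made for illustration and may not match the released file format.

```python
from collections import defaultdict


def accuracy_by_slice(examples, predictions):
    """Return {slice_name: accuracy} given gold records and predicted Wikidata QIDs.

    `examples` is a list of dicts with hypothetical keys "slice" and "gold_qid";
    `predictions` is a list of predicted QIDs in the same order.
    """
    if len(examples) != len(predictions):
        raise ValueError("examples and predictions must have the same length")

    correct = defaultdict(int)
    total = defaultdict(int)
    for example, predicted_qid in zip(examples, predictions):
        total[example["slice"]] += 1
        if predicted_qid == example["gold_qid"]:
            correct[example["slice"]] += 1

    # Accuracy is the fraction of mentions linked to the gold entity.
    return {name: correct[name] / total[name] for name in total}


# Toy records only; these are placeholders, not rows from Hansel.
examples = [
    {"mention": "mention A", "gold_qid": "Q1", "slice": "FS"},
    {"mention": "mention B", "gold_qid": "Q2", "slice": "FS"},
    {"mention": "mention C", "gold_qid": "Q3", "slice": "ZS"},
]
predictions = ["Q1", "Q5", "Q3"]

scores = accuracy_by_slice(examples, predictions)
print(scores)  # {'FS': 0.5, 'ZS': 1.0}
```

Keeping the two slices separate makes it visible when a system handles rarely seen entities (FS) but fails on entities absent from its training data (ZS), which a single pooled score would hide.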