CodRED is the first human-annotated cross-document relation extraction (RE) dataset, designed to test the ability of RE systems to acquire knowledge in the wild. CodRED has the following features:
it requires natural language understanding at different granularities, from coarse-grained document retrieval to fine-grained cross-document multi-hop reasoning;
it contains 30,504 relational facts associated with 210,812 reasoning text paths, and covers a broad range of balanced relations and long documents on diverse topics;
it provides strong supervision on the reasoning text paths used to predict each relation, helping guide RE systems toward meaningful and interpretable reasoning.