The GitTables-SemTab dataset is a subset of the GitTables dataset, created for the SemTab challenge. It consists of 1,101 tables and is used to benchmark the Column Type Annotation (CTA) task.
Its columns were annotated using semantic properties from DBpedia and semantic types and properties from Schema.org. The table below shows the number of annotated columns and number of classes used to annotate this dataset.
Annotation task | Columns | Classes |
---|---|---|
Column Type Annotation - Schema.org | 721 | 59 |
Column Type Annotation - DBpedia | 2,533 | 122 |
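
To make the benchmark setting concrete, the following is a minimal sketch of how CTA predictions might be scored against gold annotations using exact-match accuracy. The `(table_id, column_index)` key layout, the function name `cta_accuracy`, and the example labels are all assumptions made for illustration; they are not taken from the dataset files, and the official SemTab scorer may use a different metric.

```python
from typing import Dict, Tuple

# Hypothetical key: (table_id, column_index) -> annotated type label.
ColumnKey = Tuple[str, int]


def cta_accuracy(gold: Dict[ColumnKey, str], predicted: Dict[ColumnKey, str]) -> float:
    """Return the fraction of gold-annotated columns whose predicted label matches exactly."""
    if not gold:
        return 0.0
    # A column with no prediction counts as incorrect.
    correct = sum(1 for key, label in gold.items() if predicted.get(key) == label)
    return correct / len(gold)


# Illustrative labels only; real annotations use Schema.org or DBpedia terms.
gold = {("table_1", 0): "name", ("table_1", 1): "address", ("table_2", 0): "price"}
predicted = {("table_1", 0): "name", ("table_1", 1): "email", ("table_2", 0): "price"}

print(cta_accuracy(gold, predicted))
```

Only columns that carry a gold annotation are scored, which matches the table above: the column counts refer to annotated columns, not to every column in the 1,101 tables.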