GitTables-SemTab

The GitTables-SemTab dataset is a subset of the GitTables dataset, created for the SemTab challenge. It consists of 1,101 tables and is used to benchmark the Column Type Annotation (CTA) task.

Its columns were annotated with semantic properties from DBpedia and with semantic types and properties from Schema.org. The table below shows, for each ontology, the number of annotated columns and the number of classes used to annotate them.

Task and ontology                          Columns   Classes
Column Type Annotation - Schema.org            721        59
Column Type Annotation - DBpedia             2,533       122
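
To put the two annotation sets side by side, the figures in the table can be turned into an average number of annotated columns per class. The short Python sketch below does only that arithmetic; the names STATS and columns_per_class are illustrative and are not part of any GitTables tooling.

```python
# Figures copied from the table: (annotated columns, classes) per ontology.
STATS = {
    "Schema.org": (721, 59),
    "DBpedia": (2533, 122),
}


def columns_per_class(columns: int, classes: int) -> float:
    """Return the average number of annotated columns per class."""
    return columns / classes


# Average annotated columns per class for each ontology.
ratios = {name: columns_per_class(*counts) for name, counts in STATS.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} annotated columns per class on average")
```

This is an average only; the table does not say how columns are distributed across individual classes.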
