The T2Dv2 dataset consists of 779 tables originating from the English-language subset of the WebTables corpus. 237 tables are annotated for the Table Type Detection task, 236 for the Columns Property Annotation (CPA) task and 235 for the Row Annotation task. The annotations that are used are DBpedia types, properties and entities.
13 PAPERS • 4 BENCHMARKS
Click to add a brief description of the dataset (Markdown and LaTeX enabled).
2 PAPERS • NO BENCHMARKS YET
We present TNCR, a new table dataset with varying image quality collected from free open source websites. TNCR dataset can be used for table detection in scanned document images and their classification into 5 different classes.