Table annotation

20 papers with code • 0 benchmarks • 10 datasets

Table annotation is the task of annotating a table with terms or concepts from a knowledge graph or database schema. It is typically broken down into the following five subtasks:

Cell Entity Annotation (CEA)
Column Type Annotation (CTA)
Column Property Annotation (CPA)
Table Type Detection
Row Annotation
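
To make the first three subtasks concrete, the following minimal sketch shows one way their outputs could be represented for a toy table. The class and field names are hypothetical, and the Wikidata-style identifiers are illustrative only; no particular system or benchmark format is implied.

```python
from dataclasses import dataclass

# Toy table: a header row followed by two data rows (illustrative only).
table = [
    ["City", "Country"],
    ["Berlin", "Germany"],
    ["Paris", "France"],
]

@dataclass(frozen=True)
class CellEntityAnnotation:
    """CEA: links one cell to one knowledge-graph entity."""
    row: int
    col: int
    entity: str

@dataclass(frozen=True)
class ColumnTypeAnnotation:
    """CTA: links one column to one knowledge-graph class."""
    col: int
    kg_class: str

@dataclass(frozen=True)
class ColumnPropertyAnnotation:
    """CPA: links a pair of columns to one knowledge-graph property."""
    subject_col: int
    object_col: int
    kg_property: str

# Illustrative Wikidata-style identifiers (assumed for this example).
cea = [
    CellEntityAnnotation(1, 0, "wd:Q64"),   # Berlin
    CellEntityAnnotation(1, 1, "wd:Q183"),  # Germany
    CellEntityAnnotation(2, 0, "wd:Q90"),   # Paris
    CellEntityAnnotation(2, 1, "wd:Q142"),  # France
]
cta = [
    ColumnTypeAnnotation(0, "wd:Q515"),     # city
    ColumnTypeAnnotation(1, "wd:Q6256"),    # country
]
cpa = [
    ColumnPropertyAnnotation(0, 1, "wdt:P17"),  # country
]

# Print each CEA annotation next to the cell text it refers to.
for ann in cea:
    print(table[ann.row][ann.col], "->", ann.entity)
```

CEA works at the cell level, CTA at the column level, and CPA on column pairs, which is why the three record types carry different coordinates.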

The SemTab challenge is closely related to the table annotation problem. It is a yearly challenge that focuses on the first three subtasks (CEA, CTA, and CPA), and its purpose is to benchmark different table annotation systems.

Benchmarks

These leaderboards are used to track progress in Table annotation

You can find evaluation results in the subtasks. You can also submit evaluation metrics for this task.

Datasets

Subtasks

Table Type Detection

Row Annotation

Metric-Type Identification

Most implemented papers

TCN: Table Convolutional Network for Web Table Interpretation

2023-MindSpore-1/ms-code-217 • 17 Feb 2021

Existing work linearizes table cells and relies heavily on modifying deep language models such as BERT, which capture only related-cell information within the same table.

bbw: Matching CSV to Wikidata via Meta-lookup

UB-Mannheim/bbw • 1 Mar 2021

We present our publicly available semantic annotator bbw (boosted by wiki) tested at the second Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab2020).

Annotating Columns with Pre-trained Language Models

megagonlabs/doduo • 5 Apr 2021

Inferring meta information about tables, such as column headers or relationships between columns, is an active research topic in data management as we find many tables are missing some of this information.

MAGIC: Mining an Augmented Graph using INK, starting from a CSV

IBCNServices/Magic • SemTab@ISWC 2021

A large portion of structured data does not yet reap the benefits of the Semantic Web.

JenTab Meets SemTab 2021's New Challenges

fusion-jena/jentab • SemTab@ISWC 2021

While tables are a rich source of structured information, their automated use is oftentimes prevented by the inherent ambiguity contained within.

SOTAB: The WDC Schema.org Table Annotation Benchmark

wbsg-uni-mannheim/wdc-sotab • SemTab@ISWC 2023

This paper presents the WDC Schema.org Table Annotation Benchmark (SOTAB) for comparing the performance of table annotation systems.

BiodivTab: Semantic Table Annotation Benchmark Construction, Analysis, and New Additions

fusion-jena/BiodivTab • Ontology Matching@ISWC 2022 2023

Individual cells and columns are assigned to KG entities and classes to disambiguate their meaning.

A large-scale dataset for end-to-end table recognition in the wild

maxkinny/tabrecset • 27 Mar 2023

We propose a new large-scale dataset named Table Recognition Set (TabRecSet), with diverse table forms sourced from multiple scenarios in the wild and complete annotation dedicated to end-to-end table recognition (TR) research.

Column Type Annotation using ChatGPT

wbsg-uni-mannheim/tabanngpt • TaDA@VLDB 2023

Column type annotation is the task of annotating the columns of a relational table with the semantic type of the values contained in each column.
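
As a rough illustration of this task definition only, the sketch below assigns a semantic type to each column by majority vote over its values. It is a simple lookup baseline, not the ChatGPT-based approach of the paper; the lookup table, the type names, and the function name `annotate_column` are all hypothetical.

```python
from collections import Counter

# Hypothetical value-to-type lookup standing in for a knowledge graph.
TYPE_LOOKUP = {
    "Berlin": "City", "Paris": "City", "Rome": "City",
    "Germany": "Country", "France": "Country", "Italy": "Country",
}

def annotate_column(values, lookup=TYPE_LOOKUP):
    """Return the most frequent known type among the values, or None."""
    votes = Counter(lookup[v] for v in values if v in lookup)
    if not votes:
        return None  # no value could be typed
    return votes.most_common(1)[0][0]

# Columns of a toy relational table; "Atlantis" is deliberately unknown.
columns = {
    "col0": ["Berlin", "Paris", "Rome"],
    "col1": ["Germany", "France", "Atlantis"],
}
annotations = {name: annotate_column(vals) for name, vals in columns.items()}
print(annotations)  # {'col0': 'City', 'col1': 'Country'}
```

Voting over all values makes the column-level label tolerant of individual cells that cannot be typed.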

ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models

penfever/archetype • 27 Oct 2023

We introduce ArcheType, a simple, practical method for context sampling, prompt serialization, model querying, and label remapping, which enables large language models to solve CTA problems in a fully zero-shot manner.
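
The four steps named in this abstract can be sketched as a small pipeline. This is only an assumed outline, not the ArcheType implementation: every function name, the label set, and the prompt wording are hypothetical, and the model call is replaced by a stub that returns a fixed reply.

```python
import random

LABELS = ["city", "country", "person"]  # hypothetical label set

def sample_context(values, k=3, seed=0):
    """Context sampling: pick up to k distinct values from the column."""
    unique = sorted(set(values))
    if len(unique) <= k:
        return unique
    return random.Random(seed).sample(unique, k)

def serialize_prompt(sample, labels):
    """Prompt serialization: turn the sample and label set into text."""
    return (
        "Column values: " + ", ".join(sample) + "\n"
        "Choose one type from: " + ", ".join(labels) + "\n"
        "Answer:"
    )

def query_model(prompt):
    """Model querying: stub standing in for a real LLM call."""
    return "  Country.\n"  # fixed free-text reply, for illustration only

def remap_label(raw, labels):
    """Label remapping: map free-text output onto the allowed label set."""
    cleaned = raw.strip().strip(".").lower()
    for label in labels:
        if label in cleaned:
            return label
    return None  # the reply matched no allowed label

column = ["Germany", "France", "Italy", "Spain", "France"]
prompt = serialize_prompt(sample_context(column), LABELS)
prediction = remap_label(query_model(prompt), LABELS)
print(prediction)
```

The remapping step matters in a zero-shot setting because a model's free-text reply is not guaranteed to match any label exactly.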
