Table annotation

20 papers with code • 0 benchmarks • 10 datasets

Table annotation is the task of annotating a table with terms or concepts from a knowledge graph or database schema. It is typically broken down into the following five subtasks:

  1. Cell Entity Annotation (CEA)
  2. Column Type Annotation (CTA)
  3. Column Property Annotation (CPA)
  4. Table Type Detection
  5. Row Annotation
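
As a rough illustration of the first three subtasks, the sketch below annotates a toy two-column table. The lookup dictionaries are hypothetical stand-ins for a real knowledge graph, the identifiers are Wikidata-style IDs chosen for illustration, and the majority-vote rule is just one simple assumption about how cell-level matches could be aggregated; none of it is taken from a specific system listed on this page.

```python
from collections import Counter

# Toy table: each row is (country, capital). No header row is assumed.
TABLE = [
    ["Germany", "Berlin"],
    ["France", "Paris"],
]

# Hypothetical miniature knowledge graph (Wikidata-style IDs, for illustration only).
ENTITY_BY_LABEL = {"Germany": "Q183", "France": "Q142", "Berlin": "Q64", "Paris": "Q90"}
TYPE_BY_ENTITY = {"Q183": "Q6256", "Q142": "Q6256", "Q64": "Q515", "Q90": "Q515"}
# (subject entity, object entity) -> property
PROPERTY_BY_PAIR = {("Q183", "Q64"): "P36", ("Q142", "Q90"): "P36"}


def annotate_cells(table):
    """CEA: map every cell to an entity ID, or None if there is no match."""
    return [[ENTITY_BY_LABEL.get(cell) for cell in row] for row in table]


def annotate_column_types(cell_entities):
    """CTA: pick the most common entity type in each column."""
    types = []
    for column in zip(*cell_entities):
        votes = Counter(TYPE_BY_ENTITY[e] for e in column if e in TYPE_BY_ENTITY)
        types.append(votes.most_common(1)[0][0] if votes else None)
    return types


def annotate_column_property(cell_entities, subject_col, object_col):
    """CPA: pick the most common property linking two columns."""
    votes = Counter()
    for row in cell_entities:
        prop = PROPERTY_BY_PAIR.get((row[subject_col], row[object_col]))
        if prop:
            votes[prop] += 1
    return votes.most_common(1)[0][0] if votes else None


cea = annotate_cells(TABLE)
cta = annotate_column_types(cea)
cpa = annotate_column_property(cea, subject_col=0, object_col=1)
print(cea, cta, cpa)
```

Here CEA produces one entity per cell, CTA one type per column, and CPA one property per column pair. A real system would replace the dictionaries with lookups against a knowledge graph and would have to handle ambiguous or missing matches.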

The SemTab challenge is closely related to the table annotation problem. It is a yearly challenge that focuses on the first three subtasks (CEA, CTA, and CPA) and aims to benchmark table annotation systems.

Most implemented papers

TCN: Table Convolutional Network for Web Table Interpretation

2023-MindSpore-1/ms-code-217 17 Feb 2021

Existing work linearizes table cells and relies heavily on modifying deep language models such as BERT, which capture only related-cell information within the same table.

bbw: Matching CSV to Wikidata via Meta-lookup

UB-Mannheim/bbw 1 Mar 2021

We present our publicly available semantic annotator bbw (boosted by wiki) tested at the second Semantic Web Challenge on Tabular Data to Knowledge Graph Matching (SemTab2020).

Annotating Columns with Pre-trained Language Models

megagonlabs/doduo 5 Apr 2021

Inferring meta information about tables, such as column headers or relationships between columns, is an active research topic in data management as we find many tables are missing some of this information.

MAGIC: Mining an Augmented Graph using INK, starting from a CSV

IBCNServices/Magic SemTab@ISWC 2021

A large portion of structured data does not yet reap the benefits of the Semantic Web.

JenTab Meets SemTab 2021's New Challenges

fusion-jena/jentab SemTab@ISWC 2021

While tables are a rich source of structured information, their automated use is oftentimes prevented by the inherent ambiguity contained within.

SOTAB: The WDC Schema.org Table Annotation Benchmark

wbsg-uni-mannheim/wdc-sotab SemTab@ISWC 2023

This paper presents the WDC Schema.org Table Annotation Benchmark (SOTAB) for comparing the performance of table annotation systems.

A large-scale dataset for end-to-end table recognition in the wild

maxkinny/tabrecset 27 Mar 2023

We propose a new large-scale dataset named Table Recognition Set (TabRecSet), with diverse table forms sourced from multiple scenarios in the wild and complete annotation dedicated to end-to-end table recognition (TR) research.

Column Type Annotation using ChatGPT

wbsg-uni-mannheim/tabanngpt TaDA@VLDB 2023

Column type annotation is the task of annotating the columns of a relational table with the semantic type of the values contained in each column.

ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models

penfever/archetype 27 Oct 2023

We introduce ArcheType, a simple, practical method for context sampling, prompt serialization, model querying, and label remapping, which enables large language models to solve CTA problems in a fully zero-shot manner.
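
The four stages named above (context sampling, prompt serialization, model querying, and label remapping) can be sketched roughly as follows. This is not the ArcheType implementation: the label set, the prompt wording, the `query_model` stub, and the use of `difflib` for remapping are all assumptions made for illustration, and a real setup would call an actual language model in place of the stub.

```python
import difflib
import random

LABELS = ["country", "city", "person"]  # hypothetical label set


def sample_context(column, k=3, seed=0):
    """Context sampling: pick up to k distinct values from the column."""
    unique = sorted(set(column))
    rng = random.Random(seed)
    return rng.sample(unique, min(k, len(unique)))


def serialize_prompt(values, labels):
    """Prompt serialization: turn sampled values and labels into one string."""
    lines = [
        "Column values: " + ", ".join(values),
        "Choose one type from: " + ", ".join(labels),
        "Answer:",
    ]
    return "\n".join(lines)


def query_model(prompt):
    """Model querying: hypothetical stub standing in for a real LLM call."""
    return " Countries."  # deliberately not an exact label


def remap_label(answer, labels):
    """Label remapping: snap free-form output to the closest allowed label."""
    cleaned = answer.strip().strip(".").lower()
    # cutoff=0.0 forces a match, so the result is always one of `labels`.
    match = difflib.get_close_matches(cleaned, labels, n=1, cutoff=0.0)
    return match[0]


column = ["Germany", "France", "Japan", "France"]
prompt = serialize_prompt(sample_context(column), LABELS)
predicted = remap_label(query_model(prompt), LABELS)
print(predicted)
```

The remapping step is what keeps the pipeline zero-shot but still closed-set: whatever the model returns, the final prediction is one of the allowed labels.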