Semantic Annotation of Tabular Data for Machine-to-Machine Interoperability via Neuro-Symbolic Anchoring

SemTab@ISWC 2023 · Shervin Mehryar, Remzi Celebi

In this paper, we investigate the automated annotation of tabular data using semantic technologies in combination with neural network embeddings. Specifically, we propose an anchoring model in which property and cell types from the data embedding space are aligned with ontology relation and entity types. We show that combining symbolic reasoning, neural embeddings, and loss function design yields a significant performance improvement, reaching as high as 86% for column property, 82% for column type, and 87% for column qualifier annotations on table extractions from DBpedia and Wikidata.




Task                         Dataset       Model   Metric Name  Metric Value  Global Rank
Columns Property Annotation  WDC SOTAB V2  MUT2KG  Micro F1     79.35         # 2
Column Type Annotation       WDC SOTAB V2  MUT2KG  Micro F1     32.01         # 5
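
The Micro F1 values reported above pool true positives, false positives, and false negatives over all annotated columns before computing precision and recall. The following is a minimal sketch of that calculation, not the benchmark's official evaluator: the function name `micro_f1`, the example labels, and the convention that `None` marks a column left unannotated are all assumptions made for illustration.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 for single-label column annotations.

    Assumption (not from the source): a prediction of None means the
    system produced no annotation for that column.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p is None:
            fn += 1          # missed annotation
        elif p == g:
            tp += 1          # correct annotation
        else:
            fp += 1          # wrong label was emitted
            fn += 1          # correct label was missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels, for illustration only.
gold = ["name", "birthDate", "address", "telephone"]
pred = ["name", "birthDate", "email", None]
score = micro_f1(gold, pred)
print(f"Micro F1: {100 * score:.2f}")
```

Leaderboard figures such as 79.35 are this score expressed as a percentage. When every column receives exactly one prediction, micro F1 reduces to plain accuracy, so the two only diverge when some columns are left unannotated.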

Methods