TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Column Type Annotation	VizNet-Sato-Full	Sato	Macro-F1	75.6	# 3
Column Type Annotation	VizNet-Sato-Full	Sato	Weighted-F1	90.2	# 2
Column Type Annotation	VizNet-Sato-MultiColumn	Sato	Weighted-F1	92.5	# 1
Column Type Annotation	VizNet-Sato-MultiColumn	Sato	Macro-F1	73.5	# 2

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/sato-contextual-semantic-type-detection-in/column-type-annotation-on-viznet-sato-1)](https://paperswithcode.com/sota/column-type-annotation-on-viznet-sato-1?p=sato-contextual-semantic-type-detection-in)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/sato-contextual-semantic-type-detection-in/column-type-annotation-on-viznet-sato-full)](https://paperswithcode.com/sota/column-type-annotation-on-viznet-sato-full?p=sato-contextual-semantic-type-detection-in)`

Sato: Contextual Semantic Type Detection in Tables

14 Nov 2019 · Dan Zhang, Yoshihiko Suhara, Jinfeng Li, Madelon Hulsebos, Çağatay Demiralp, Wang-Chiew Tan

Detecting the semantic types of data columns in relational tables is important for various data preparation and information retrieval tasks such as data cleaning, schema matching, data discovery, and semantic search. However, existing detection approaches either perform poorly with dirty data, support only a limited number of semantic types, fail to incorporate the table context of columns or rely on large sample sizes for training data. We introduce Sato, a hybrid machine learning model to automatically detect the semantic types of columns in tables, exploiting the signals from the context as well as the column values. Sato combines a deep learning model trained on a large-scale table corpus with topic modeling and structured prediction to achieve support-weighted and macro average F1 scores of 0.925 and 0.735, respectively, exceeding the state-of-the-art performance by a significant margin. We extensively analyze the overall and per-type performance of Sato, discussing how individual modeling components, as well as feature categories, contribute to its performance.
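The abstract reports two averages of per-type F1: a support-weighted average and a macro average. The following is a minimal sketch of how the two differ, not the authors' evaluation code. The column types `city` and `name` and the toy predictions are hypothetical, and the averaging follows the common convention of scoring every label that appears in either the gold or the predicted labels.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted label was wrong
            fn[gold] += 1  # gold label was missed
    support = Counter(y_true)
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1 = 2 * tp[label] / denom if denom else 0.0
        scores[label] = (f1, support[label])
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every type counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of gold columns per type."""
    scores = per_class_f1(y_true, y_pred)
    total = sum(n for _, n in scores.values())
    return sum(f1 * n for f1, n in scores.values()) / total


# Hypothetical, imbalanced toy data: 8 "city" columns and 2 "name" columns.
y_true = ["city"] * 8 + ["name"] * 2
y_pred = ["city"] * 8 + ["city", "name"]  # one "name" column mislabeled as "city"

macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, weighted F1 = {weighted:.3f}")
```

Because the frequent type is predicted well and the rare type is not, the weighted average comes out higher than the macro average in this toy case. The same pattern appears in the reported scores, where Weighted-F1 exceeds Macro-F1 on both datasets.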

Code

megagonlabs/sato official

108 stars

Tasks

Column Type Annotation

Information Retrieval

Retrieval

Structured Prediction

Vocal Bursts Type Prediction

Datasets

Introduced in the Paper:

VizNet-Sato

Used in the Paper:

DBpedia

Results from the Paper

Ranked #2 on Column Type Annotation on VizNet-Sato-MultiColumn
