TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Table Retrieval	Statcan Dialogue Dataset	DPR (retrieving basic info + member items)	Recall@10	46.2	# 1
Table Retrieval	Statcan Dialogue Dataset	TAPAS (retrieving truncated table)	Recall@10	22.1	# 5
Table Retrieval	Statcan Dialogue Dataset	TAPAS-NQ (retrieving truncated table)	Recall@10	30.0	# 4
Table Retrieval	Statcan Dialogue Dataset	DPR (retrieving basic info)	Recall@10	45.0	# 2
Table Retrieval	Statcan Dialogue Dataset	DPR (retrieving title)	Recall@10	43.8	# 3
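
The Recall@10 values above measure how often the correct table appears among a model's top 10 retrieved candidates. The following is a minimal sketch of that metric, not the authors' evaluation code. It assumes exactly one gold table per conversation, and the function name `recall_at_k` and the table IDs are hypothetical.

```python
from typing import Sequence


def recall_at_k(
    ranked_ids: Sequence[Sequence[str]],
    gold_ids: Sequence[str],
    k: int = 10,
) -> float:
    """Return the fraction of queries whose gold ID is in the top-k ranking.

    Assumption: each query (conversation) has exactly one gold table ID.
    """
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("ranked_ids and gold_ids must have the same length")
    if not gold_ids:
        raise ValueError("at least one query is required")
    # A query counts as a hit when its gold ID is among the first k results.
    hits = sum(gold in ranked[:k] for ranked, gold in zip(ranked_ids, gold_ids))
    return hits / len(gold_ids)


# Toy example with made-up table IDs: the first query is a hit, the second a miss.
ranked = [["t3", "t1", "t7"], ["t2", "t9", "t4"]]
gold = ["t1", "t5"]
print(recall_at_k(ranked, gold, k=10))  # 0.5
```

The function returns a fraction between 0 and 1; multiplying by 100 gives a percentage, which appears to be the scale used in the table.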

The StatCan Dialogue Dataset: Retrieving Data Tables through Conversations with Genuine Intents

3 Apr 2023 · Xing Han Lu, Siva Reddy, Harm de Vries

We introduce the StatCan Dialogue Dataset consisting of 19,379 conversation turns between agents working at Statistics Canada and online users looking for published data tables. The conversations stem from genuine intents, are held in English or French, and lead to agents retrieving one of over 5,000 complex data tables. Based on this dataset, we propose two tasks: (1) automatic retrieval of relevant tables based on an ongoing conversation, and (2) automatic generation of appropriate agent responses at each turn. We investigate the difficulty of each task by establishing strong baselines. Our experiments on a temporal data split reveal that all models struggle to generalize to future conversations, as we observe a significant drop in performance across both tasks when we move from the validation to the test set. In addition, we find that response generation models struggle to decide when to return a table. Considering that the tasks pose significant challenges to existing models, we encourage the community to develop models for our task, which can be directly used to help knowledge workers find relevant tables for live chat users.

Code

McGill-NLP/statcan-dialogue-dataset official

Tasks

Dialogue Generation

Retrieval

Table Retrieval

Datasets

Introduced in the Paper:

Statcan Dialogue Dataset

Used in the Paper:

test

Results from the Paper

Ranked #1 on Table Retrieval on Statcan Dialogue Dataset

Methods

Adafactor • Attention Dropout • BPE • Dense Connections • Dropout • GELU • GLU • Inverse Square Root Schedule • Layer Normalization • Linear Layer • Multi-Head Attention • Residual Connection • Scaled Dot-Product Attention • SentencePiece • Softmax • T5
