We release E-commerce Dialogue Corpus, comprising a training data set, a development set and a test set for retrieval based chatbot. The statistics of E-commerical Conversation Corpus are shown in the following table.
37 PAPERS • 1 BENCHMARK
The DSTC7 Task 1 dataset is a dataset and task for goal-oriented dialogue. The data originates from human-human conversations, which is built from online resources, specifically the Ubuntu Internet Relay Chat (IRC) channel and an Advising dataset from the University of Michigan.
11 PAPERS • 1 BENCHMARK
Reddit Corpus is part of a repository of conversational datasets consisting of hundreds of millions of examples, and a standardised evaluation procedure for conversational response selection models using '1-of-100 accuracy'. The Reddit Corpus contains 726 million multi-turn dialogues from the Reddit board.
7 PAPERS • 1 BENCHMARK
Advising Corpus is a dataset based on an entirely new collection of dialogues in which university students are being advised which classes to take. These were collected at the University of Michigan with IRB approval. They were released as part of DSTC 7 track 1 and used again in DSTC 8 track 2.
4 PAPERS • 1 BENCHMARK
This dataset is for evaluating the task of Black-box Multi-agent Integration which focuses on combining the capabilities of multiple black-box conversational agents at scale. It provides data to explore two main frameworks of exploration: question agent pairing and question response pairing.
1 PAPER • 1 BENCHMARK