Reddit Conversation Corpus

Introduced by Dziri et al. in Augmenting Neural Response Generation with Context-Aware Topical Attention

Reddit Conversation Corpus (RCC) consists of conversations, scraped from Reddit, for a 20 month period from November 2016 until August 2018. To ensure the quality and diversity of topics, 95 subreddits are selected from which conversations are collected. In total, RCC contains 9.2 million 3-turn conversations.

Papers


Paper Code Results Date Stars

Dataset Loaders


No data loaders found. You can submit your data loader here.

Tasks


Similar Datasets


License


Modalities


Languages