Grammarly’s Yahoo Answers Formality Corpus (GYAFC) is the largest style transfer dataset for any single style, containing a total of 110K informal/formal sentence pairs.
97 PAPERS • 3 BENCHMARKS
Touchdown is a corpus for executing navigation instructions and resolving spatial descriptions in visual real-world environments. The task is to follow instructions to a goal position and find a hidden object there, Touchdown the bear.
16 PAPERS • 1 BENCHMARK
PASTEL is a parallelly annotated stylistic language dataset. The dataset consists of ~41K parallel sentences and 8.3K parallel stories annotated across different personas.
4 PAPERS • NO BENCHMARKS YET
Gutenberg Poem Dataset is used for the next verse prediction component.
1 PAPER • NO BENCHMARKS YET