52 dataset results for Natural Language Understanding AND Texts

MultiWOZ-coref (or MultiWOZ 2.3) is an extension of the MultiWOZ dataset that adds co-reference annotations in addition to corrections of dialogue acts and dialogue states.

1 PAPER • NO BENCHMARKS YET

Phrase-in-Context

Phrase in Context (PiC) is a curated benchmark for phrase understanding and semantic search, consisting of three tasks of increasing difficulty: Phrase Similarity (PS), Phrase Retrieval (PR) and Phrase Sense Disambiguation (PSD). The datasets are annotated by 13 linguistic experts on Upwork and verified by two groups: ~1000 AMT crowdworkers and another set of 5 linguistic experts. The PiC benchmark is distributed under CC-BY-NC 4.0.

1 PAPER • NO BENCHMARKS YET

Pre-trained Transliterated Embeddings for Indian Languages

We release various types of word embeddings for multiple Indian languages. Please note that for the majority of our work we transliterated the corpora to the Devanagari script, so the script differs from that of the original text. The release includes word embedding models trained with FastText and ELMo, as well as cross-lingual models based on an orthogonal alignment of monolingual models for all pairs of these languages.

1 PAPER • NO BENCHMARKS YET

bigscience/P3

bigscience/P3 (bigscience/P3, split='ai2_arc_ARC_Challenge_pick_the_most_correct_option')

This dataset consists of challenging reasoning questions in multiple-choice format.

1 PAPER • NO BENCHMARKS YET
