CoLA (Corpus of Linguistic Acceptability)

Introduced by Warstadt et al. in Neural Network Acceptability Judgments

The Corpus of Linguistic Acceptability (CoLA) consists of 10,657 sentences drawn from 23 linguistics publications, expertly annotated for acceptability (grammaticality) by the original authors of those publications. The public version contains the 9,594 sentences of the training and development sets and excludes the 1,063 sentences of the held-out test set.
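
As a rough illustration, the short Python sketch below checks that the stated split sizes add up and shows one way such a file might be read. The four-column tab-separated layout, the `read_examples` helper, and the two sample rows are assumptions made for illustration; they are not taken from the description above, so check them against the actual release before relying on them.

```python
import csv
import io

# Counts stated in the dataset description.
TOTAL_SENTENCES = 10657
PUBLIC_SENTENCES = 9594     # training + development sets
HELD_OUT_SENTENCES = 1063   # held-out test set

# The public and held-out portions should account for the whole corpus.
assert PUBLIC_SENTENCES + HELD_OUT_SENTENCES == TOTAL_SENTENCES

# ASSUMPTION: four tab-separated columns with no header row:
# source code, label (1 = acceptable, 0 = unacceptable),
# the original author's mark (e.g. "*"), and the sentence.
# Both rows are invented for illustration and are not from the corpus.
SAMPLE_TSV = (
    "ex01\t1\t\tThe cat sat on the mat.\n"
    "ex01\t0\t*\tThe cat sat the on mat.\n"
)


def read_examples(handle):
    """Parse rows in the assumed layout into dictionaries (hypothetical helper)."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        {"source": source, "label": int(label), "sentence": sentence}
        for source, label, _author_mark, sentence in reader
    ]


examples = read_examples(io.StringIO(SAMPLE_TSV))
acceptable = sum(example["label"] for example in examples)
print(f"{acceptable} of {len(examples)} sample sentences are labeled acceptable")
```

To read a real file instead of the inline sample, pass an open file handle to `read_examples` in place of the `io.StringIO` object.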

Source: https://nyu-mll.github.io/CoLA/
