8 Nov 2023 • Urban Knupleš, Diego Frassinelli, Sabine Schulte im Walde
Humans tend to strongly agree on ratings on a scale for extreme cases (e.g., a CAT is judged as very concrete), but judgements on mid-scale words exhibit more disagreement.