The Stanford Sentiment Treebank is a corpus with fully labeled parse trees that allows for a complete analysis of the compositional effects of sentiment in language. The corpus is based on the dataset introduced by Pang and Lee (2005) and consists of 11,855 single sentences extracted from movie reviews. It was parsed with the Stanford parser and includes a total of 215,154 unique phrases from those parse trees, each annotated by 3 human judges.
2,032 PAPERS • 9 BENCHMARKS
The SST-5, also known as the Stanford Sentiment Treebank with 5 labels, is a dataset used for sentiment analysis. The SST-5 dataset consists of 11,855 single sentences extracted from movie reviews¹. It includes a total of 215,154 unique phrases from parse trees, each annotated by 3 human judges¹. Each phrase is labeled as either negative, somewhat negative, neutral, somewhat positive, or positive. This is why it's referred to as SST-5 or SST fine-grained.
288 PAPERS • 3 BENCHMARKS
The RAFT benchmark (Realworld Annotated Few-shot Tasks) focuses on naturally occurring tasks and uses an evaluation setup that mirrors deployment.
16 PAPERS • 1 BENCHMARK
A dataset specifically tailored to the biotech news sector, aiming to transcend the limitations of existing benchmarks. This dataset is rich in complex content, comprising various biotech news articles covering various events, thus providing a more nuanced view of information extraction challenges.
0 PAPER • NO BENCHMARKS YET