RoBERTa large SST

# Summary

This model fine-tunes RoBERTa large on the binary classification setting of the Stanford Sentiment Treebank (SST-2). It achieves 95.11% accuracy on the test set.

[Explore live Sentiment Analysis demo at AllenNLP](https://demo.allennlp.org/sentiment-analysis/roberta-sentiment-analysis).

## How do I load this model?

```python
from allennlp_models.pretrained import load_predictor
predictor = load_predictor("roberta-sst")
```

### Getting predictions

```python
sentence = "This film doesn't care about cleverness, wit or any other kind of intelligent humor."
preds = predictor.predict(sentence)
print(f"p(positive)={preds['probs'][0]:.2%}")
# prints: p(positive)=0.44%
```
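To turn the raw probabilities into a readable label, something like the following sketch could work. It assumes, following the snippet above, that index 0 of `probs` is the positive class and index 1 is the negative class. `describe_prediction` is a hypothetical helper for illustration, not part of AllenNLP.

```python
from typing import Sequence, Tuple


def describe_prediction(probs: Sequence[float]) -> Tuple[str, float]:
    """Map a two-element probability vector to (label, confidence).

    Assumption: probs[0] is p(positive) and probs[1] is p(negative),
    matching the indexing used in the example above.
    """
    if len(probs) != 2:
        raise ValueError("expected exactly two class probabilities")
    names = ("positive", "negative")
    # Pick the index of the larger probability.
    best = max(range(2), key=lambda i: probs[i])
    return names[best], probs[best]


# Illustrative values only; real ones come from predictor.predict(...)["probs"].
label, confidence = describe_prediction([0.0044, 0.9956])
print(f"{label} ({confidence:.2%})")
```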

You can also get predictions using the `allennlp` command line interface:

```shell
echo '{"sentence": "This film doesn'\''t care about cleverness, wit or any other kind of intelligent humor."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/sst-roberta-large-2020.06.08.tar.gz -
```
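The command above reads one JSON object per line from standard input. For more than a handful of sentences, it may be easier to write them to a file first. The sketch below shows one way to do that; `write_predict_input` and the file name `sentences.jsonl` are hypothetical names chosen for this example.

```python
import json
from pathlib import Path
from typing import Iterable


def write_predict_input(sentences: Iterable[str], path: str) -> int:
    """Write one {"sentence": ...} JSON object per line; return the line count."""
    count = 0
    with Path(path).open("w", encoding="utf-8") as handle:
        for sentence in sentences:
            # json.dumps handles quotes and apostrophes, so no shell escaping is needed.
            handle.write(json.dumps({"sentence": sentence}) + "\n")
            count += 1
    return count


n = write_predict_input(
    ["A charming and often affecting journey.", "The plot never comes together."],
    "sentences.jsonl",
)
print(f"wrote {n} lines to sentences.jsonl")
```

The resulting file path can then be passed to `allennlp predict` in place of the trailing `-`.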

## How do I evaluate this model?

To evaluate the model on the Stanford Sentiment Treebank test set, run:

```shell
allennlp evaluate https://storage.googleapis.com/allennlp-public-models/sst-roberta-large-2020.06.08.tar.gz \
    https://allennlp.s3.amazonaws.com/datasets/sst/test.txt
```
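For readers inspecting the test file by hand, the following sketch illustrates how fine-grained treebank labels are commonly collapsed into the binary setting. It assumes the file stores one parenthesized tree per line with a root label from 0 to 4, that labels 0 and 1 count as negative, 3 and 4 as positive, and that neutral (2) examples are dropped. `binary_label` is a hypothetical helper, not the dataset reader used by the model.

```python
import re
from typing import Optional

# Matches the root label of a tree line such as "(3 (2 It) (4 works))".
_ROOT_LABEL = re.compile(r"^\s*\((\d)\s")


def binary_label(tree_line: str) -> Optional[str]:
    """Return "1" (positive), "0" (negative), or None for neutral lines.

    Assumption: 0-1 are negative, 2 is neutral (dropped), 3-4 are positive.
    """
    match = _ROOT_LABEL.match(tree_line)
    if match is None:
        raise ValueError(f"not a labeled tree: {tree_line!r}")
    fine = int(match.group(1))
    if fine == 2:
        return None
    return "1" if fine > 2 else "0"


print(binary_label("(3 (2 It) (4 works))"))
```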

## How do I train this model?

To train this model, use the `allennlp` CLI tool with the configuration file [stanford_sentiment_treebank_roberta.jsonnet](https://raw.githubusercontent.com/allenai/allennlp-models/v2.1.0/training_config/classification/stanford_sentiment_treebank_roberta.jsonnet):

```shell
allennlp train stanford_sentiment_treebank_roberta.jsonnet -s output_dir
```
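For a quick smoke test before a full run, individual settings can be overridden from the command line with the `--overrides` option. The sketch below only assembles such a command and prints it; it does not start training. `build_train_command` is a hypothetical helper, and the key `trainer.num_epochs` is an assumption about the layout of the configuration file that should be checked against the file itself.

```python
import json
import shlex
from typing import Any, Dict, List


def build_train_command(config: str, output_dir: str, overrides: Dict[str, Any]) -> List[str]:
    """Assemble an `allennlp train` argument list; does not run anything."""
    command = ["allennlp", "train", config, "-s", output_dir]
    if overrides:
        # --overrides takes a JSON structure; dotted keys address nested settings.
        command += ["--overrides", json.dumps(overrides)]
    return command


command = build_train_command(
    "stanford_sentiment_treebank_roberta.jsonnet",
    "output_dir",
    {"trainer.num_epochs": 1},  # assumed key; check it against the config file
)
print(shlex.join(command))
```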

See the [AllenNLP Training and prediction](https://guide.allennlp.org/training-and-prediction#2) guide for more details.

## Citation

```bibtex
@article{Liu2019RoBERTaAR,
 author = {Y. Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and M. Lewis and Luke Zettlemoyer and Veselin Stoyanov},
 journal = {ArXiv},
 title = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
 volume = {abs/1907.11692},
 year = {2019}
}
```

| Hyperparameter | Value |
| --- | --- |
| LR | 0.00002 |
| Epochs | 10 |
| Dropout | 0.1 |
| Batch Size | 32 |

| Property | Value |
| --- | --- |
| Training Techniques | AdamW |
| Architecture | Dropout, Layer Normalization, Linear Layer, RoBERTa, Tanh |