| Training Techniques | Adam |
| --- | --- |
| Architecture | Convolution, Dropout, ELMo, ESIM, Feedforward Network, Highway Layer, LSTM, Linear Layer, ReLU, Variational Dropout |
| LR | 0.0004 |
This model implements ESIM (Enhanced Sequential Inference Model), a sequential neural inference model built on chain LSTMs.
```python
from allennlp_models.pretrained import load_predictor

predictor = load_predictor("pair-classification-esim")

premise = "A man in a black shirt overlooking bike maintenance."
hypothesis = "A man destroys a bike."
preds = predictor.predict(premise, hypothesis)

# The entries of label_probs are ordered (entailment, contradiction, neutral).
labels = ["entailment", "contradiction", "neutral"]
for label, prob in zip(labels, preds["label_probs"]):
    print(f"p({label}) = {prob:.2%}")
# prints:
# p(entailment) = 1.52%
# p(contradiction) = 81.70%
# p(neutral) = 16.78%
```
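The predicted class is simply the argmax over `label_probs`. A small helper for that is sketched below; `top_label` is a hypothetical convenience function, not part of the AllenNLP API:

```python
# Hypothetical helper (not part of AllenNLP): pick the most likely label
# from the "label_probs" array returned by predictor.predict().
LABELS = ["entailment", "contradiction", "neutral"]

def top_label(preds):
    """Return (label, probability) for the highest-probability class."""
    probs = preds["label_probs"]
    i = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[i], probs[i]

# Using the probabilities printed above:
preds = {"label_probs": [0.0152, 0.8170, 0.1678]}
label, prob = top_label(preds)
print(label, prob)  # contradiction 0.817
```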
You can also get predictions using the AllenNLP command-line interface:

```shell
echo '{"premise": "A man in a black shirt overlooking bike maintenance.", "hypothesis": "A man destroys a bike."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/esim-elmo-2020.11.11.tar.gz -
```
To evaluate the model on the Stanford Natural Language Inference (SNLI) test set, run:

```shell
allennlp evaluate https://storage.googleapis.com/allennlp-public-models/esim-elmo-2020.11.11.tar.gz \
    https://allennlp.s3.amazonaws.com/datasets/snli/snli_1.0_test.jsonl
```
To train this model, you can use the allennlp CLI tool and the configuration file esim.jsonnet:

```shell
allennlp train esim.jsonnet -s output_dir
```
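For orientation, an AllenNLP training config pairs a dataset reader, a model, and a trainer. The fragment below is only a schematic sketch (the optimizer and learning rate are taken from the table above), not the actual contents of esim.jsonnet:

```jsonnet
// Schematic sketch only -- consult the real esim.jsonnet in allennlp-models
// for the full dataset_reader, model, and trainer settings.
{
  "dataset_reader": {"type": "snli"},
  "train_data_path": "https://allennlp.s3.amazonaws.com/datasets/snli/snli_1.0_train.jsonl",
  "validation_data_path": "https://allennlp.s3.amazonaws.com/datasets/snli/snli_1.0_dev.jsonl",
  "model": {"type": "esim"},
  "trainer": {
    "optimizer": {"type": "adam", "lr": 0.0004}
  }
}
```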
See the AllenNLP Training and prediction guide for more details.
```bibtex
@inproceedings{Chen2017EnhancedLF,
  author    = {Qian Chen and Xiao-Dan Zhu and Z. Ling and Si Wei and Hui Jiang and Diana Inkpen},
  booktitle = {ACL},
  title     = {Enhanced LSTM for Natural Language Inference},
  year      = {2017}
}
```