Sentence Embeddings

95 papers with code • 0 benchmarks • 11 datasets

Greatest papers with code

Toward Better Storylines with Sentence-Level Language Models

google-research/google-research ACL 2020

We propose a sentence-level language model which selects the next sentence in a story from a finite set of fluent alternatives.

Language Modelling Sentence Embeddings +1

Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation

UKPLab/sentence-transformers EMNLP 2020

The training is based on the idea that a translated sentence should be mapped to the same location in the vector space as the original sentence.

Knowledge Distillation Sentence Embedding
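
As a rough illustration of the training idea quoted above, the sketch below pulls a student model's embeddings of a sentence and of its translation toward a teacher model's embedding of the original sentence. It assumes mean-squared error as the distance; the function names and the three-dimensional toy vectors are hypothetical and are not taken from the sentence-transformers code.

```python
def mse(a, b):
    """Mean-squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(teacher_src, student_src, student_tgt):
    """Assumed objective: the student's embeddings of the source sentence
    and of its translation should both match the teacher's embedding of
    the source sentence."""
    return mse(teacher_src, student_src) + mse(teacher_src, student_tgt)


# Toy embeddings (hypothetical values, for illustration only).
teacher_src = [0.2, -0.5, 0.9]   # teacher embedding of the original sentence
student_src = [0.2, -0.5, 0.9]   # student embedding of the original sentence
student_tgt = [0.1, -0.4, 0.7]   # student embedding of the translation

loss = distillation_loss(teacher_src, student_src, student_tgt)
print(loss)
```

The loss reaches zero only when the translation lands on the same point in the vector space as the original sentence, which is the property the quoted sentence describes.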

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

UKPLab/sentence-transformers IJCNLP 2019

However, it requires that both sentences are fed into the network, which causes a massive computational overhead: Finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT.

Semantic Similarity Semantic Textual Similarity +2
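
The "about 50 million" figure in the Sentence-BERT excerpt follows from counting unordered sentence pairs, n(n-1)/2. The short sketch below reproduces that arithmetic; the per-pair time is only what the quoted ~65 hours would imply, not a measurement.

```python
from math import comb


def pair_count(n: int) -> int:
    """Number of unordered pairs among n sentences: n * (n - 1) / 2."""
    return comb(n, 2)


n_sentences = 10_000
pairs = pair_count(n_sentences)
print(pairs)  # 49995000, i.e. roughly 50 million pairwise passes

# Time per pair implied by the quoted ~65 hours (an estimate only).
seconds_per_pair = 65 * 3600 / pairs
print(f"{seconds_per_pair * 1000:.1f} ms per pair")
```

Because the pair count grows quadratically with the collection size, doubling the number of sentences roughly quadruples the number of pairwise passes.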

WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia

facebookresearch/LASER EACL 2021

We present an approach based on multilingual sentence embeddings to automatically extract parallel sentences from the content of Wikipedia articles in 85 languages, including several dialects or low-resource languages.

Sentence Embeddings

Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

facebookresearch/LASER TACL 2019

We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts.

Cross-Lingual Bitext Mining Cross-Lingual Document Classification +5

Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings

facebookresearch/LASER ACL 2019

Machine translation is highly sensitive to the size and quality of the training data, which has led to an increasing interest in collecting and filtering large parallel corpora.

Cross-Lingual Bitext Mining Machine Translation +2

What you can cram into a single vector: Probing sentence embeddings for linguistic properties

facebookresearch/InferSent 3 May 2018

Although much effort has recently been devoted to training high-quality sentence embeddings, we still have a poor understanding of what they are capturing.

General Classification Sentence Classification +1

Universal Sentence Encoder

facebookresearch/InferSent 29 Mar 2018

For both variants, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance.

Conversational Response Selection Semantic Textual Similarity +6

DisSent: Sentence Representation Learning from Explicit Discourse Relations

facebookresearch/InferSent 12 Oct 2017

Learning effective representations of sentences is one of the core missions of natural language understanding.

Dependency Parsing Natural Language Understanding +1