fastText

Introduced by Bojanowski et al. in Enriching Word Vectors with Subword Information

fastText embeddings use subword information to build word embeddings. A vector is learned for each character $n$-gram, and each word is represented as the sum of its $n$-gram vectors. This extends word2vec-style models with subword information, so that words sharing a prefix or suffix also share part of their representation. With words represented as sets of character $n$-grams, a skip-gram model is trained to learn the embeddings.
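The composition step described above can be sketched in a few lines of Python. This is a minimal illustration, not the fastText implementation: the names `char_ngrams`, `word_vector`, and `table` are hypothetical, the vectors are random toy values rather than trained ones, and the sketch assumes the `<` and `>` boundary markers and the $n$-gram range of 3 to 6 described in the paper. It also skips the skip-gram training and the hashing of $n$-grams into a fixed number of buckets.

```python
import random


def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of `word` wrapped in boundary markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    # The full wrapped word is also kept as its own unit (added once).
    if wrapped not in grams:
        grams.append(wrapped)
    return grams


def word_vector(word, table, dim=4):
    """Sum the vectors of all n-grams of `word`.

    `table` maps n-gram -> vector. Missing entries are filled with
    deterministic toy values; a real model would learn them instead.
    """
    total = [0.0] * dim
    for gram in char_ngrams(word):
        if gram not in table:
            rng = random.Random(gram)  # seeded by the n-gram, for repeatability
            table[gram] = [rng.uniform(-1, 1) for _ in range(dim)]
        total = [t + g for t, g in zip(total, table[gram])]
    return total


if __name__ == "__main__":
    print(char_ngrams("where", 3, 3))
    table = {}
    print(word_vector("where", table))
```

With `n_min = n_max = 3`, `char_ngrams("where", 3, 3)` yields `<wh`, `whe`, `her`, `ere`, `re>`, plus the whole-word unit `<where>`. Because the boundary markers are part of the $n$-grams, the trigram `her` inside `where` is distinct from the word `<her>`.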

Source: Enriching Word Vectors with Subword Information

Tasks

Task	Papers	Share
Text Classification	33	7.91%
General Classification	28	6.71%
Sentence	24	5.76%
Sentiment Analysis	22	5.28%
Classification	16	3.84%
Named Entity Recognition (NER)	15	3.60%
Language Modelling	12	2.88%
Word Similarity	11	2.64%
Clustering	7	1.68%

Categories

Word Embeddings

Static Word Embeddings