
Recent Developments in Word Embeddings

  • The most commonly used models are word2vec and GloVe, both unsupervised approaches based on the distributional hypothesis: words that occur in the same contexts tend to have similar meanings (see the training sketch after this list).
  • While several works augment these unsupervised approaches with supervision from semantic or syntactic knowledge, purely unsupervised approaches saw notable developments in 2017–2018, most prominently FastText (an extension of word2vec) and ELMo (state-of-the-art contextual word vectors).
  • FastText was developed by the team of Tomas Mikolov, who proposed the word2vec framework in 2013 and triggered the explosion of research on universal word embeddings.
  • The main improvement of FastText over the original word2vec vectors is the inclusion of character n-grams, which allows computing representations for words that did not appear in the training data (“out-of-vocabulary” words; see the sketch below).
  • FastText vectors are super fast to train, and pre-trained vectors are available for 157 languages, trained on Wikipedia and Common Crawl. They are a great baseline.
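
To make the distributional hypothesis concrete, here is a minimal training sketch using gensim's Word2Vec implementation (the library choice, the gensim 4.x API, and the toy corpus are assumptions for illustration, not something this page prescribes). On a realistic corpus, words that share contexts end up with nearby vectors.

```python
# Minimal word2vec training sketch (assumes gensim 4.x; toy corpus for illustration only).
from gensim.models import Word2Vec

# Tiny tokenized corpus; meaningful vectors require millions of sentences.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["a", "cat", "chased", "a", "mouse"],
    ["a", "dog", "chased", "a", "ball"],
]

# sg=1 selects the skip-gram variant; vector_size is the embedding dimension.
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=100)

# "cat" and "dog" occur in near-identical contexts, so their vectors should be close.
print(model.wv.similarity("cat", "dog"))
```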
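The out-of-vocabulary behaviour can be sketched the same way, here with gensim's FastText implementation standing in for the original C++ tool (again an assumption): because a word's vector is composed from its character n-gram vectors, a word never seen during training still gets a representation.

```python
# FastText OOV sketch (assumes gensim 4.x): a vector for a word absent from training.
from gensim.models import FastText

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
]

# min_n/max_n set the character n-gram lengths used to compose word vectors.
model = FastText(sentences, vector_size=50, window=3, min_count=1, min_n=3, max_n=5, epochs=50)

print("sitting" in model.wv.key_to_index)  # False: "sitting" never appeared in training
print(model.wv["sitting"].shape)           # (50,): composed from character n-gram vectors
```

A plain word2vec model would raise a KeyError for "sitting"; FastText falls back on subword units instead.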