Gensim-LSI-Word-Similarities

Simple functions to create word-word similarities from Gensim's latent semantic indexing (LSI) in Python. The two core functions, wordsim and wordvectsim, each produce a cosine-derived similarity score (0 = low, 1 = high) between two words in a Gensim-generated LSA/LSI space, computed across all of the dimensions specified when the model was created (i.e., num_topics from gensim.models.LsiModel).

wordsim and wordvectsim require Gensim, pandas, and SciPy.

Includes four functions:

wordsim: Creates a cosine-derived similarity score (from 0 to 1) between two individual words. Input:

word1 (string or string variable)
word2 (string or string variable)
target_dictionary (Gensim-created LSI dictionary)
target_lsi_model (Gensim-created LSI model)

wordvectsim: Same as wordsim, but designed to calculate similarity scores (from 0 to 1) for each word pair in a 2-dimensional word vector (e.g., using numpy.apply_along_axis). Input:

word_vector2d (2D string vector or 2D string vector variable)
target_dictionary (Gensim-created LSI dictionary)
target_lsi_model (Gensim-created LSI model)

Two additional functions (or sets of functions) have been added. Detailed documentation is available inside each function and will be added here soon:

word2vec_vect_sim_fun: similarity score function for gensim's word2vec
word_pair_similarity_matrix: word-word similarity matrix function for gensim's LSI (LSA) model

