I just want to be able to see the values in my word2vec model.

I have a very small corpus. I just want to see exactly what happens in each step for this particular corpus.

A section of my code is below.

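from gensim import corpora
from gensim.models import Word2Vec
# gensim 4.x import path; in gensim 3.x, WordEmbeddingSimilarityIndex lives in gensim.models
from gensim.similarities import (SoftCosineSimilarity, SparseTermSimilarityMatrix,
                                 WordEmbeddingSimilarityIndex)

# Train word vectors on the tokenized corpus, keeping every word
word2vec = Word2Vec(corpus, min_count=1)
word_vectors = word2vec.wv

# Term-to-term similarity lookup backed by the word vectors
termsim_index = WordEmbeddingSimilarityIndex(word_vectors)

# Bag-of-words representation of the documents
dictionary = corpora.Dictionary(food)
bow_corpus = [dictionary.doc2bow(doc) for doc in food]

# Sparse term similarity matrix, and the document similarity index built on it
similarity_matrix = SparseTermSimilarityMatrix(termsim_index, dictionary)
docsim_index = SoftCosineSimilarity(bow_corpus, similarity_matrix, num_best=10)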

So I want to see exactly what is in `word_vectors`, `termsim_index`, `similarity_matrix`, and `docsim_index`.

Susan-l3p

1 Answer

To see more of what's happening during each function, you should enable logging at the INFO level.
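As a minimal sketch of enabling that logging, assuming the usual pattern of configuring Python's standard `logging` module before training or building the indexes (the format string is only an illustrative choice):

```python
import logging

# Gensim reports progress through the standard `logging` module, so a
# root-level configuration at INFO is usually enough to surface its messages.
logging.basicConfig(
    format="%(asctime)s : %(levelname)s : %(message)s",
    level=logging.INFO,
)

# If something else already configured logging, basicConfig() does nothing,
# so also set the level on the "gensim" logger explicitly.
gensim_logger = logging.getLogger("gensim")
gensim_logger.setLevel(logging.INFO)
```

Run this before the `Word2Vec(...)` call so the training and index-building steps emit their progress messages.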

Beyond that, each of the objects you created has documented properties you can freely examine, either by looking at the gensim docs for each class or by using generic Python operations like those described in other SO questions, such as "Is there a built-in function to print all the current properties and values of an object?".
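A rough sketch of that kind of generic inspection follows. The `describe` helper and the `ToyIndex` class are hypothetical names made up for illustration (they are not part of gensim); `ToyIndex` only exists so the snippet runs without a trained model.

```python
def describe(obj):
    """Return {attribute name: short summary} for an object's instance attributes."""
    summary = {}
    for name, value in vars(obj).items():
        # NumPy arrays and SciPy sparse matrices expose a .shape attribute.
        shape = getattr(value, "shape", None)
        if shape is not None:
            summary[name] = f"{type(value).__name__} with shape {shape}"
            continue
        try:
            # Containers (lists, dicts, strings) are summarized by length.
            summary[name] = f"{type(value).__name__} of length {len(value)}"
        except TypeError:
            # Scalars and other objects without a length: show them directly.
            summary[name] = repr(value)
    return summary


class ToyIndex:
    """Hypothetical stand-in so the sketch runs without gensim installed."""

    def __init__(self):
        self.topn = 10
        self.terms = ["apple", "pear"]


for name, info in describe(ToyIndex()).items():
    print(f"{name}: {info}")
```

On the objects from the question you would call, for example, `describe(termsim_index)` or `describe(similarity_matrix)`, then drill into whichever attributes look interesting. The attribute names you get back depend on the installed gensim version.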

To give more specific suggestions, you'd have to explain in more detail exactly what you "want to see".

gojomo
  • I would like to see for example the values in `termsim_index`. So from what I understand the `WordEmbeddingSimilarityIndex` function calculates cosine similarities. I would like to see the numbers that were calculated. – Susan-l3p Oct 10 '19 at 18:37
  • And just typing `print(termsim_index)` does not work. – Susan-l3p Oct 10 '19 at 18:40
  • What does "does not work" mean in this context? Did you try the options in the answer I linked, to show the properties/values/methods of the object, then drill down into those individually? Generally, to understand the internals of an object like that, you would also want to review its source code, either in your local installation or in the online repo. – gojomo Oct 10 '19 at 19:33
  • Most of the options from the link give me the following output : ```''```. I get this when I use a simple print statement too. I want to see the array with the cosine similarities. I want to know exactly what is in there. – Susan-l3p Oct 10 '19 at 20:10
  • I want to see the mapping between the entities and the vectors. – Susan-l3p Oct 10 '19 at 20:15
  • Did you try the options in the answer I linked, to show the properties/values/methods of the object, then drill down into those individually? But also, if what you really want to see is specific term-to-term similarities, the use of these `Index` classes isn't strictly necessary or even advised. You may want to use methods on the `word_vectors` object, like `most_similar()` and `similarity()`, instead. – gojomo Oct 11 '19 at 00:24
  • And finally, if you're truly curious about what's in an object, there's no getting around looking at the source code – the code for `WordEmbeddingSimilarityIndex` is barely 20 lines, & pretty much just calls the word-vectors `most_similar()`. It doesn't have any "array with the cosine similarities"! – gojomo Oct 11 '19 at 00:25
  • Aaa ok thank you. I will just go and look at the source code. – Susan-l3p Oct 11 '19 at 06:01
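
To make the point from the comments concrete, here is a toy sketch of what term-to-term cosine similarities look like when computed on demand rather than stored in an array. The vectors are made up and the helper functions are illustrative stand-ins, not gensim's implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up 3-dimensional vectors standing in for trained word vectors.
toy_vectors = {
    "apple": [1.0, 0.0, 1.0],
    "pear": [1.0, 0.0, 0.5],
    "car": [0.0, 1.0, 0.0],
}


def most_similar(word, vectors, topn=2):
    """Rank every other word by cosine similarity to `word`, highest first."""
    scores = [
        (other, cosine_similarity(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


for word in toy_vectors:
    print(word, most_similar(word, toy_vectors))
```

With a trained model you would not need these helpers: `word_vectors.similarity("w1", "w2")` returns a single cosine similarity and `word_vectors.most_similar("w1", topn=5)` returns a ranked list of (word, similarity) pairs, where `"w1"` and `"w2"` are placeholders for words in your own vocabulary.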