1

I trained a word2vec model using gensim, and I want to randomly select vectors from it and find the corresponding words. What is the best way to do so?

oren_isp

2 Answers

1

If your Word2Vec model instance is in the variable model, then model.wv.index2word holds a list of all words known to the model. (The property names are slightly different in older versions of gensim, and in gensim 4.0 and later the same list is model.wv.index_to_key.)

So, you can pick one item using the choice() function from Python's standard random module:

import random
print(random.choice(model.wv.index2word))
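
Since the attribute name depends on the gensim version, a small helper can hide the difference and return the vector together with the word. The following is only a sketch: `vocabulary`, `random_word_and_vector`, and `ToyVectors` are hypothetical names, and `ToyVectors` is a stand-in for model.wv so the example runs without a trained model.

```python
import random

def vocabulary(kv):
    """Return the list of known words for either gensim API generation."""
    # gensim 4.x exposes index_to_key; gensim 3.x exposed index2word.
    if hasattr(kv, "index_to_key"):
        return kv.index_to_key
    return kv.index2word

def random_word_and_vector(kv, rng=random):
    """Pick one known word uniformly at random and look up its vector."""
    word = rng.choice(vocabulary(kv))
    return word, kv[word]

class ToyVectors:
    """Hypothetical stand-in for model.wv (word list plus lookup by word)."""
    def __init__(self, table):
        self._table = dict(table)
        self.index_to_key = list(self._table)

    def __getitem__(self, word):
        return self._table[word]

toy = ToyVectors({"cat": [0.1, 0.2], "dog": [0.3, 0.1], "fish": [0.9, 0.5]})
word, vector = random_word_and_vector(toy)
print(word, vector)
```

With a real model, passing model.wv instead of `toy` should work the same way, assuming it supports lookup by word as shown in the comments on this answer.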
gojomo
  • 1
    Your code returns a key. Is there no way to directly select a random vector? – oren_isp Aug 14 '18 at 06:50
  • 1
    Look up that key: `print(model.wv[random.choice(model.wv.index2entity)])`. If what you really want is a random vector in that coordinate space, *not* exactly one of the known word-vectors, see another SO answer, such as: https://stackoverflow.com/questions/5408276/sampling-uniformly-distributed-random-points-inside-a-spherical-volume/23785326 – gojomo Aug 14 '18 at 18:04
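
For the second case in the comment above (a random point in the coordinate space rather than a known word-vector), one common approach is to draw a Gaussian direction and rescale the radius. This is a rough sketch, not taken from the thread: `random_point_in_ball` is a hypothetical helper, and the dimension of 300 and radius of 1.0 are assumptions.

```python
import math
import random

def random_point_in_ball(dim, radius=1.0, rng=random):
    """Sample a point uniformly from the dim-dimensional ball of the given radius."""
    # A vector of independent Gaussians has a uniformly distributed direction.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in direction))
    # Scaling the radius by u ** (1 / dim) makes the point uniform in volume.
    r = radius * rng.random() ** (1.0 / dim)
    return [r * x / norm for x in direction]

point = random_point_in_ball(dim=300)
print(len(point), math.sqrt(sum(x * x for x in point)))
```

To map such a point back to words, gensim's `similar_by_vector` on model.wv should return the nearest known words, probably after converting the point to a NumPy array. Trained word-vectors are not spread uniformly through the space, so the nearest word to a random point may not be very meaningful.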
0

If you want to get n random words (keys) from a word2vec model with Gensim 4.0.0, just use random.sample:

import random
import gensim
# Here we use Gensim 4.0.0
w2v = gensim.models.KeyedVectors.load_word2vec_format("model.300d")
# Get 10 random words (keys) from word2vec model
random_words = random.sample(w2v.index_to_key, 10)
print("Random words: "+ str(random_words))

Piece of cake :)
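
If the vectors are needed as well, the sampled keys can be looked up afterwards. A minimal sketch follows, with two caveats: random.sample raises ValueError when n exceeds the vocabulary size, so the hypothetical helper `sample_words` clamps n, and `toy_vectors` is a made-up stand-in for w2v so the example runs without a model file.

```python
import random

def sample_words(keys, n, seed=None):
    """Return up to n distinct keys chosen uniformly at random."""
    rng = random.Random(seed)  # a fixed seed gives a reproducible sample
    # random.sample raises ValueError if n exceeds the population size.
    return rng.sample(keys, min(n, len(keys)))

# Hypothetical stand-in for w2v: maps each word to its vector.
toy_vectors = {
    "cat": [0.1, 0.2],
    "dog": [0.3, 0.1],
    "fish": [0.9, 0.5],
    "bird": [0.4, 0.4],
}
keys = list(toy_vectors)  # plays the role of w2v.index_to_key

random_words = sample_words(keys, 10, seed=0)
random_vectors = {word: toy_vectors[word] for word in random_words}
print(random_vectors)
```

With the real model, `sample_words(w2v.index_to_key, 10)` followed by `w2v[word]` for each sampled word should give the same kind of result.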

Kyrylo Malakhov