
This function is called on every search request and takes about 2 seconds:

def get_model(config: dict):
    return gensim.models.Word2Vec.load(config['path_model']).wv

Is it possible to load it once, save it in a cache or in Redis, and reuse it afterwards to speed up the search?

I have already tried this option:

import pickle

model = get_model(config)
pickled_object = pickle.dumps(model)
key = 'model_object'
redis_instance.set(key, pickled_object)

And then I read it back:

t0 = time.time()
get_object = redis_instance.get('model_object')
t1 = time.time()
total = t1 - t0
print('Get object', total)

t0 = time.time()
model = pickle.loads(get_object)
t1 = time.time()
total = t1 - t0
print('Pickle loads', total)

but the processing time didn't change:

Mar 30 09:47:54 ip-172-31-25-211 gunicorn[27354]: Get object 1.2066564559936523
Mar 30 09:47:54 ip-172-31-25-211 gunicorn[27354]: Pickle loads 0.854522705078125
  • It might be relevant for your case to check the Redis Module https://redisai.io. It supports storing model and running them on Redis, reducing the load time. – Guy Korland Mar 31 '21 at 12:49

1 Answer


You're correct that you want to keep the model in memory, so the load cost isn't paid redundantly on every request.

But shuttling it through Redis isn't the best way to do that: the object still has to be fetched over the network and unpickled on every request, which is roughly as expensive as loading it from disk.
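A quick stdlib-only sketch of why this happens: the deserialization step alone dominates, regardless of where the bytes come from. (The large list here is just a stand-in for a model's big arrays; no Redis involved.)

```python
import pickle
import time

big = list(range(2_000_000))   # stand-in for a model's large arrays
blob = pickle.dumps(big)       # what redis_instance.get() would hand back

t0 = time.time()
obj = pickle.loads(blob)       # this cost is paid on *every* request
print(f"unpickle took {time.time() - t0:.3f}s")
```

The timings in your log (about 1.2s for the fetch plus 0.85s for the unpickle) add up to roughly the same 2 seconds you started with.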

See instead this older answer for relevant ideas:

How to speed up Gensim Word2vec model load time?

But note things get easier in the just-released Gensim 4.0.0: you don't need the init_sims or syn0norm patching. Simply loading the model with mmap='r' may be enough, and the usual similarity operations won't require as much extra memory or as many extra steps.
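To illustrate the mechanism behind mmap='r' with only the stdlib (the file and its contents here are hypothetical stand-ins, not the gensim file format): memory-mapping maps the file into the address space without reading it up front, so pages are pulled from disk lazily only when touched, which is why the "load" becomes nearly instant.

```python
import mmap
import os
import struct
import tempfile

# Write a tiny stand-in "vectors" file: 4 packed floats (16 bytes).
path = os.path.join(tempfile.mkdtemp(), "vectors.bin")
with open(path, "wb") as f:
    f.write(struct.pack("4f", 0.1, 0.2, 0.3, 0.4))

# mmap maps the file into memory; pages are read from disk lazily,
# only when actually accessed -- the same trick gensim uses for the
# model's big arrays when you pass mmap='r'.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    vec = struct.unpack("4f", mm[:16])
    mm.close()
```

Combined with the per-process cache above the gunicorn workers, this should bring per-request overhead close to zero.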

gojomo