I have an issue similar to the one discussed in "gensim word2vec - updating word embeddings with newcoming data".
The following code trains a model and saves it as text8_gensim.bin:
from gensim.models import word2vec
sentences = word2vec.Text8Corpus('train/text8')
model = word2vec.Word2Vec(sentences, size=200, workers=12, min_count=5, sg=0, window=8, iter=15, sample=1e-4, alpha=0.05, cbow_mean=1, negative=25)
model.save("./savedModel/text8_gensim.bin")
Here is the code that loads the saved model and adds more data to it:
fname = "savedModel/text8_gensim.bin"
model = word2vec.Word2Vec.load(fname)
model.epochs = 15
# Custom words
docs = ["start date", "end date", "eft date", "termination date"]
model.build_vocab(docs, update=True)
model.train(docs, total_examples=model.corpus_count, epochs=model.epochs)
model.wv.similarity('start', 'eft')
The model loads fine; however, when I call model.wv.similarity I get the following error:
KeyError: "word 'eft' not in vocabulary"
Am I missing something here?
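Two things in the update step look like plausible causes, and the sketch below illustrates both without needing gensim installed. First, Word2Vec expects each sentence to be a list of word tokens, but docs is a list of plain strings, so iterating over each "sentence" yields single characters. Second, as far as I understand, the min_count=5 stored in the saved model also applies to the vocabulary update, and 'eft' occurs only once in docs. The helper surviving_vocab is hypothetical and only mimics a frequency cutoff; it is not part of gensim.

```python
from collections import Counter

# The documents exactly as passed in the question: plain strings.
docs = ["start date", "end date", "eft date", "termination date"]

# Iterating over a string yields characters, so each "sentence" becomes
# a sequence of single-character tokens rather than words.
char_tokens = [list(doc) for doc in docs]

# Splitting each string first gives a list of token lists instead.
word_tokens = [doc.split() for doc in docs]


def surviving_vocab(sentences, min_count):
    """Hypothetical helper: keep tokens seen at least min_count times."""
    counts = Counter(token for sentence in sentences for token in sentence)
    return {token for token, n in counts.items() if n >= min_count}


print(char_tokens[2])                             # the characters of "eft date"
print(surviving_vocab(word_tokens, min_count=5))  # no word reaches 5 occurrences
print(surviving_vocab(word_tokens, min_count=1))  # 'eft' is kept with a cutoff of 1
```

If this is what is happening, tokenizing the new documents (for example with doc.split()) and lowering the model's min_count before calling build_vocab(..., update=True) would be the first things to try.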