How to find similar sentence from a corpus on word2vec?

Question

I have implemented word2vec on my corpus using the TensorFlow tutorial: https://www.tensorflow.org/tutorials/text/word2vec#next_steps Now I want to give a sentence as input and find a similar sentence in the corpus.

Any leads on how I can perform this?

Does this answer your question? [How to calculate the sentence similarity using word2vec model of gensim with python](https://stackoverflow.com/questions/22129943/how-to-calculate-the-sentence-similarity-using-word2vec-model-of-gensim-with-pyt) — justanyphil, Feb 05 '21 at 09:53

justanyphil · Accepted Answer · 2021-02-04T13:28:12.557

A simple word2vec model is not capable of such a task, as it only relates the semantics of individual words to each other, not the semantics of whole sentences. Such a model has no generative function; it only serves as a look-up table.

Word2vec models map word strings to vectors in the embedding space. To find similar words for a given sample word, one can simply go through all vectors in the vocabulary and pick the ones that are closest (in terms of the 2-norm) to the sample word vector. For further information you could go here or here.
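As a rough illustration of that look-up, here is a minimal sketch in plain Python. The `embeddings` table and the `nearest_words` helper are hypothetical names with made-up toy vectors; in practice the vectors would be read from the trained model's embedding layer, which is not shown here.

```python
import math

# Hypothetical toy embedding table (word -> vector). These values are
# invented for illustration and do not come from a trained model.
embeddings = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.90],
    "pear": [0.12, 0.25, 0.88],
}


def nearest_words(word, table, k=2):
    """Return the k words whose vectors are closest to `word` in the 2-norm."""
    query = table[word]
    # Brute-force scan over the whole vocabulary, skipping the query word.
    distances = [
        (math.dist(query, vector), other)
        for other, vector in table.items()
        if other != word
    ]
    distances.sort()
    return [other for _, other in distances[:k]]


print(nearest_words("king", embeddings, k=1))  # ['queen']
```

The scan is linear in the vocabulary size, which is fine for a word vocabulary but is exactly what does not carry over to sentences.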

However, this does not work for sentences, as it would require a whole vocabulary of sentences from which to pick similar ones, which is not feasible.

Edit: This seems to be a duplicate of this question.
