Best way to handle OOV words when using pretrained embeddings in PyTorch

Question

I am using pretrained word2vec embeddings in PyTorch (following the code here). However, this approach does not seem to handle unseen (out-of-vocabulary) words. Is there a good way to solve this?

score 1 · Answer 1 · answered Dec 11 '18 at 05:47

FastText builds character n-gram vectors as part of model training. When it encounters an out-of-vocabulary (OOV) word, it sums the vectors of that word's character n-grams to produce a vector for the word. You can find more detail here.
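To make the mechanism concrete, here is a minimal pure-Python sketch of the idea, not FastText's actual implementation. The names `char_ngrams`, `fnv1a`, `ngram_table`, and `oov_vector` are hypothetical, the n-gram table is filled with random numbers rather than trained vectors, and the bucket count and dimension are arbitrary. Some implementations average the n-gram vectors instead of summing them; this sketch sums, as described above.

```python
import random


def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of a word wrapped in boundary markers."""
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(wrapped) - n + 1)
    ]


def fnv1a(text):
    """32-bit FNV-1a hash, used here only to map an n-gram to a bucket."""
    h = 2166136261
    for byte in text.encode("utf-8"):
        h = ((h ^ byte) * 16777619) & 0xFFFFFFFF
    return h


DIM, BUCKETS = 4, 1000  # arbitrary toy sizes (assumption)
rng = random.Random(0)
# Toy stand-in for trained n-gram vectors: random values, NOT real embeddings.
ngram_table = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(BUCKETS)]


def oov_vector(word):
    """Build a vector for an unseen word by summing its n-gram vectors."""
    vec = [0.0] * DIM
    for gram in char_ngrams(word):
        row = ngram_table[fnv1a(gram) % BUCKETS]
        vec = [a + b for a, b in zip(vec, row)]
    return vec


print(char_ngrams("where", 3, 3))
print(oov_vector("unseenword"))
```

In practice you would not write this yourself: a trained FastText model performs this lookup for you, and the resulting vectors can be copied into a weight matrix and passed to `torch.nn.Embedding.from_pretrained`.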

polm23

