
I am trying to use fastText's pre-trained French binary model (downloaded from the official fastText GitHub page). I need the .bin model rather than the .vec word vectors so that I can approximate vectors for misspelled and out-of-vocabulary words.

However, when I try to load the model using:

from gensim.models import FastText
model = FastText.load_fasttext_format('french_bin_model_path')

I get the following error:

NotImplementedError: Supervised fastText models are not supported

What is surprising is that it works just fine when I load the English binary model.

I am running Python 3.6 and gensim 3.5.0.

Any ideas as to why it doesn't work with the French vectors are welcome!

Clara-sininen

2 Answers


I ran into the same problem and ended up using Facebook's Python wrapper for fastText instead of gensim's implementation.

import fastText
model = fastText.load_model(path_to_french_bin)

Then you can get word vectors for out-of-vocabulary words like so:

oov_vector = model.get_word_vector(oov_word)
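
For context on why the .bin file matters here: fastText represents each word as a bag of character n-grams and builds the vector of an unseen word from the vectors of its n-grams, which only the binary model stores. Below is a minimal pure-Python sketch of the n-gram extraction step only. It is not fastText's actual implementation; the function names `char_ngrams` and `shared_ngram_ratio` are hypothetical, and the 3-to-6 length range is an assumption for illustration (the pre-trained models may have been trained with different lengths).

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, wrapped in '<' and '>' boundary markers.

    min_n and max_n are illustrative defaults, not values read from any model.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n over the wrapped word.
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


def shared_ngram_ratio(a, b, min_n=3, max_n=6):
    """Fraction of the distinct n-grams of `a` that also occur in `b` (toy measure)."""
    grams_a = set(char_ngrams(a, min_n, max_n))
    grams_b = set(char_ngrams(b, min_n, max_n))
    return len(grams_a & grams_b) / len(grams_a) if grams_a else 0.0


if __name__ == "__main__":
    # A misspelling still shares some n-grams with the correct word,
    # which is what lets a subword model place it near the original.
    print(char_ngrams("ab", 3, 3))
    print(shared_ngram_ratio("bonjour", "bonjuor"))
```

The overlap between a misspelled word and its correct form is the intuition behind why get_word_vector can return a sensible vector for a word that was never seen during training.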

As for why gensim's load_fasttext_format works for the English model but not the French one, I don't know.

efont

I have never used fastText, but the problem might be the encoding of your file. Try changing it to UTF-8 if you are on macOS, or to Latin-1 if you are on Windows.

Vincent Quirion