8

To get the vector of a word, I can use:

model["word"]

But if I want the vector of a sentence, I need to either sum the vectors of all its words or take their average.
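
For reference, the manual averaging I have in mind looks roughly like the sketch below. The toy dictionary only stands in for the model, and `average_vector` is my own helper name, not a fastText function.

```python
from statistics import fmean


def average_vector(words, lookup):
    """Average the vectors of `words`; lookup(word) must return a sequence of floats."""
    vectors = [lookup(w) for w in words]
    if not vectors:
        raise ValueError("cannot average an empty sentence")
    # zip(*vectors) groups the i-th component of every word vector together.
    return [fmean(component) for component in zip(*vectors)]


# Toy stand-in for model["word"] (hypothetical data, not real embeddings).
# With a real model, pass `lambda w: model[w]` as the lookup instead.
toy = {"hello": [1.0, 2.0], "world": [3.0, 6.0]}
sentence_vector = average_vector("hello world".split(), toy.__getitem__)
print(sentence_vector)  # prints [2.0, 4.0]
```

Summing instead of averaging would just replace `fmean` with `sum`.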

Does FastText provide a method to do this?

Aanchal1103
Andrey

3 Answers

10

If you want to compute vector representations of sentences or paragraphs, please use:

$ ./fasttext print-sentence-vectors model.bin < text.txt

This assumes that the text.txt file contains the paragraphs that you want to get vectors for. The program will output one vector representation per line in the file.

This is documented in the README of the fastText repo: https://github.com/facebookresearch/fastText
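
If you would rather drive the CLI from a script, here is a minimal sketch in Python (my own, not taken from the fastText docs). It assumes the binary is at `./fasttext`, the model is `model.bin`, and that each output line contains only the space-separated numbers of one vector; some versions may also echo the sentence, in which case the parsing would need adjusting. The helper names `parse_vectors` and `sentence_vectors` are hypothetical.

```python
import subprocess


def parse_vectors(output):
    """Turn CLI output (assumed: one space-separated vector per line) into lists of floats."""
    return [
        [float(x) for x in line.split()]
        for line in output.splitlines()
        if line.strip()  # skip blank lines
    ]


def sentence_vectors(sentences, binary="./fasttext", model="model.bin"):
    """Run print-sentence-vectors once for all sentences, one sentence per input line."""
    # Each sentence must occupy exactly one line, so embedded newlines become spaces.
    text = "\n".join(s.replace("\n", " ") for s in sentences) + "\n"
    result = subprocess.run(
        [binary, "print-sentence-vectors", model],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_vectors(result.stdout)


# Parsing a made-up two-line output, without needing the binary or a model:
sample = "0.1 -0.2 0.3\n0.4 0.5 -0.6\n"
print(parse_vectors(sample))
```

The point of the design is that the command is invoked once for the whole batch, and the n-th output line corresponds to the n-th input line.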

nickb
Aanchal1103
  • Is there another implementation in Java? –  Apr 20 '17 at 09:47
  • AFAIK, fastText only supports a CLI for now. But I was able to find a library that provides a Python interface to fastText. You can search to see if there is one for Java. – Aanchal1103 Apr 20 '17 at 10:01
  • I found one, https://github.com/vinhkhuc/JFastText, but I have the same question as @Andrey. I would loop over the lines, then loop over the words to get the vector for each one, but how do I get the total? I couldn't find anything like the command you posted. –  Apr 20 '17 at 10:03
  • jft.runCmd(new String[] { "supervised", "-input", "src/test/resources/data/labeled_data.txt", "-output", "src/test/resources/models/supervised.model" }); This snippet has been picked from the library you mentioned. You can use the command 'print-vectors' just like this, but you will have to figure out how to pass in the parameters as I don't know much about running commands from java code. – Aanchal1103 Apr 20 '17 at 10:17
  • Thanks for replying. I need to process the data line by line because I'm using this in real time; I think it would be wrong to pass in the whole file at once, right? –  Apr 20 '17 at 10:20
  • 2
    No, the purpose of this 'print-vectors' command is to give you the vectors of all the lines in a file. If you see the command again 'text.txt' is a file that contains preprocessed data (i.e. one paragraph per line). You just have to put all your sentences in a file in the format specified and pass in that file to 'print-vectors' as an option. – Aanchal1103 Apr 20 '17 at 10:32
  • You mean that each call to print-vectors handles a single line, not all the lines in the file at once? –  Apr 20 '17 at 10:45
  • 3
    Okay this is getting really difficult to explain :P I'll try to explain in more simple words. When you call print-vectors, you provide it a file (your input file with lots of paragraphs or sentences and one line of the file is treated as one paragraph). You can have as many paragraphs in a file as you like. You have to call print-vectors only once and it will output the vectors of all the lines in the input file. I suggest you go through the Fasttext docs, everything has been mentioned there nicely. :) – Aanchal1103 Apr 20 '17 at 10:56
  • Many thanks, Aanchal, for helping, and sorry for the late reply. I still want to make sure I understood correctly: I call print-vectors only once and it outputs vectors for all the lines in the file, like a for loop? I really appreciate your help and patience. –  Apr 21 '17 at 16:48
  • @AanchalSharma Thanks a lot for your great answer. Please let me know if you know an answer for this: https://stackoverflow.com/questions/46923066/fasttext-bigram-vs-sentence-word-vectors –  Oct 25 '17 at 02:40
2

You can also use the Python wrapper. Install it by following the official install guide: https://fasttext.cc/docs/en/python-module.html#installation

And after that:

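import fasttext

model = fasttext.load_model('model.bin')  # path to a trained fastText binary model
vect = model.get_sentence_vector("some string")  # vector for a single sentence
# `text` is an iterable of sentences; newlines are removed because
# get_sentence_vector expects a single line of text
vect2 = [model.get_sentence_vector(el.replace('\n', '')) for el in text]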
Mikhail_Sam
  • Note that this is pretty difficult to get working on Windows. I'd suggest using [gensim](https://radimrehurek.com/gensim/models/fasttext.html) instead. – CutePoison Feb 26 '23 at 07:41
0

To get the vector for a single sentence using fastText, try the following command:

$ echo "Your Sentence Here" | ./fasttext print-sentence-vectors model.bin

For an example, refer to Learn Word Representations In Fasttext.

arjun