
I want to analyze the vectors to look for patterns, and then use an SVM on them for a supervised classification task between classes A and B. (I know it may sound odd, but it's our homework.) So I really need to know:

1- How do I extract the coded vectors of a document using a trained model?

2- How do I interpret them, and how does word2vec encode them?

I'm using gensim's word2vec.

Farhood
    If you are trying to classify whole documents, you should check out the Doc2Vec model, which is also available in the gensim library. A (slightly outdated) tutorial is here: https://rare-technologies.com/doc2vec-tutorial/ and be sure to check my answer here for an up-to-date version: http://stackoverflow.com/questions/31321209/doc2vec-how-to-get-document-vectors/39329194#39329194 – Lenka Vraná May 11 '17 at 15:25

1 Answer

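  1. If you have a trained word2vec model, you can get a word's vector through the __getitem__ method of its word vectors (model.wv; indexing the model directly was deprecated and later removed in gensim 4):

    import gensim
    model = gensim.models.Word2Vec(sentences)  # sentences: an iterable of token lists
    print(model.wv["some_word_from_dictionary"])  # vector of an in-vocabulary word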

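  2. Unfortunately, embeddings from word2vec/doc2vec are not directly interpretable by a person (in contrast to topic vectors from LdaModel): the individual dimensions have no human-readable meaning, and only the relative positions of the vectors (distances and similarities) carry information.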
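To illustrate the second point, here is a minimal sketch of comparing vectors by cosine similarity, which is the usual way to read anything out of them. The dict `toy_vectors` and its numbers are made up for illustration; with a real model the vectors would come from `model.wv[word]`.

```python
import math

# Hypothetical 3-dimensional vectors standing in for model.wv[word];
# the values are invented purely for this example.
toy_vectors = {
    "king": [0.9, 0.1, 0.4],
    "queen": [0.8, 0.2, 0.5],
    "apple": [0.1, 0.9, 0.0],
}


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# A single coordinate says nothing on its own; the comparison between
# pairs of vectors is what can be interpreted.
sim_related = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
sim_unrelated = cosine_similarity(toy_vectors["king"], toy_vectors["apple"])
print(sim_related > sim_unrelated)
```

gensim offers the same kind of comparison on a trained model through `model.wv.similarity(w1, w2)` and `model.wv.most_similar(word)`.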

P.S. If the objects in your task are whole texts, then you should use the Doc2Vec model.
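If you want to stay with plain word2vec, a common baseline (not Doc2Vec, and not something the model does for you) is to average the word vectors of a document into one fixed-length vector. The sketch below assumes a dict `toy_vectors` in place of `model.wv`, and the helper name `document_vector` is hypothetical.

```python
def document_vector(tokens, word_vectors, dim):
    """Average the vectors of the tokens found in the vocabulary."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        # No in-vocabulary words: fall back to a zero vector.
        return [0.0] * dim
    # zip(*known) groups the vectors component by component.
    return [sum(component) / len(known) for component in zip(*known)]


# Hypothetical 2-dimensional vectors standing in for model.wv.
toy_vectors = {
    "good": [1.0, 0.0],
    "movie": [0.0, 1.0],
}

# Out-of-vocabulary tokens are skipped rather than raising KeyError.
doc = ["good", "movie", "unknownword"]
vec = document_vector(doc, toy_vectors, dim=2)
print(vec)
```

One such vector per document, together with the A/B labels, is the kind of input a supervised classifier such as scikit-learn's `SVC` expects. Averaging discards word order, so treat it as a starting point to compare against Doc2Vec.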

Ivan Menshikh