This assignment is an exercise from an intro to deep learning course. It uses a bag-of-words representation for each tweet. How can word embeddings be used to achieve the same thing? While playing around with the word2vec tool, I came across the following questions:
(i) How do I obtain pre-trained embeddings to represent these tweets? (That is, use pre-trained word2vec vectors directly instead of training embedding vectors on the tweets myself.) How do I load such a pre-trained model with word2vec?
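Here is a minimal sketch of the loading step I have in mind, assuming the vectors come in the standard word2vec binary format; the file name GoogleNews-vectors-negative300.bin is just an example, not part of the assignment:

import numpy as np

def load_word2vec_bin(path):
    """Read the standard word2vec binary format into a dict of numpy vectors."""
    vectors = {}
    with open(path, "rb") as f:
        # Header line: "<vocab_size> <embedding_size>\n"
        vocab_size, dim = map(int, f.readline().split())
        for _ in range(vocab_size):
            # Each word is stored as raw bytes terminated by a space;
            # a stray '\n' may be left over from the previous entry.
            chars = []
            while True:
                ch = f.read(1)
                if ch == b" ":
                    break
                if ch != b"\n":
                    chars.append(ch)
            word = b"".join(chars).decode("utf-8", errors="ignore")
            # The vector itself is dim float32 values stored back to back.
            vectors[word] = np.frombuffer(f.read(4 * dim), dtype=np.float32)
    return vectors, dim

# Hypothetical usage with the publicly available GoogleNews vectors:
# vectors, embedding_size = load_word2vec_bin("GoogleNews-vectors-negative300.bin")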
(ii) How do I train the TensorFlow two-hidden-layer architecture once we obtain embeddings from word2vec? The input dimensions will change because of embedding_size. In other words, continuing from the previous bag-of-words model, what additional changes are needed because of the embeddings? (See the sketch after the layer summary below.)
Previously it was:
input dimension: (None, vocab_size)
Layer-1: (input_data * weights_1) + biases_1
Layer-2: (layer_1 * weights_2) + biases_2
output layer: (layer_2 * weights_out) + biases_out
output dimension: (None, n_classes)
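My current understanding, which may be wrong, is that only the input changes: each tweet becomes the average of its word vectors, so the input dimension goes from (None, vocab_size) to (None, embedding_size). A rough sketch in the TensorFlow 1.x style of the course code; the averaging step, the ReLU activations, and the hidden sizes are my own assumptions:

import numpy as np
import tensorflow as tf

embedding_size = 300                             # must match the word2vec vectors
n_hidden_1, n_hidden_2, n_classes = 256, 256, 2  # my guesses, not assignment values

def tweet_to_vector(tokens, vectors, dim):
    # Average the embeddings of the words we have vectors for;
    # out-of-vocabulary words are simply skipped.
    found = [vectors[w] for w in tokens if w in vectors]
    return np.mean(found, axis=0) if found else np.zeros(dim, dtype=np.float32)

# Only the input dimension changes: vocab_size -> embedding_size.
input_data = tf.placeholder(tf.float32, [None, embedding_size])

weights_1 = tf.Variable(tf.random_normal([embedding_size, n_hidden_1]))
biases_1 = tf.Variable(tf.zeros([n_hidden_1]))
layer_1 = tf.nn.relu(tf.matmul(input_data, weights_1) + biases_1)

weights_2 = tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2]))
biases_2 = tf.Variable(tf.zeros([n_hidden_2]))
layer_2 = tf.nn.relu(tf.matmul(layer_1, weights_2) + biases_2)

weights_out = tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
biases_out = tf.Variable(tf.zeros([n_classes]))
output_layer = tf.matmul(layer_2, weights_out) + biases_out  # shape (None, n_classes)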
(iii) Is it necessary to obtain embeddings for the given tweet data by training word2vec from scratch? How do I train on a dataset of around 14k tweets using word2vec itself (not gensim or GloVe)? Will word2vec treat @ as a stop word during preprocessing?
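For context on the @ question, this is the preprocessing I currently have in mind before handing the 14k tweets to the word2vec command-line tool; the lowercasing, the URL placeholder, and the file names are my own assumptions:

import re

def preprocess_tweet(tweet):
    # Lowercase, replace links with a placeholder, and collapse whitespace.
    # @mentions are kept as ordinary tokens, since as far as I can tell the
    # word2vec C tool only splits on whitespace and does no stop-word removal.
    tweet = tweet.lower()
    tweet = re.sub(r"https?://\S+", "<url>", tweet)
    return " ".join(tweet.split())

# Write one preprocessed tweet per line; word2vec reads whitespace-separated tokens.
with open("tweets_raw.txt") as src, open("tweets_train.txt", "w") as dst:
    for line in src:
        dst.write(preprocess_tweet(line) + "\n")

# Then train with the original C tool, e.g.:
# ./word2vec -train tweets_train.txt -output tweet_vectors.bin \
#     -size 100 -window 5 -negative 5 -min-count 2 -binary 1 -cbow 1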