
I'm trying to modify this TensorFlow LSTM model to load this pre-trained GoogleNews word embedding, GoogleNews-vectors-negative300.bin (a TensorFlow Word2Vec embedding would be just as good).

I've been reading examples of how to load a pre-trained word embedding into TensorFlow (e.g. 1: here, 2: here, 3: here, and 4: here).

In the first linked example they can easily assign the embedding to the graph:

sess.run(cnn.W.assign(initW))

In the second linked example they create an embedding-wrapper variable:

with tf.variable_scope("embedding_rnn_seq2seq/rnn/embedding_wrapper", reuse=True):
        em_in = tf.get_variable("embedding")

then they initialize the embedding wrapper:

sess.run(em_in.assign(initW))    

Both those examples make sense, but it's not obvious to me how I can assign the unpacked embedding initW to the TF graph in my case. (I'm a TF beginner).

I can prepare initW like the first two examples:

import os
import numpy as np

def loadEmbedding(self, word_to_id):
    # New model: load the pre-trained word2vec data and initialize embeddings.
    with open(os.path.join('GoogleNews-vectors-negative300.bin'), "rb", 0) as f:
        # The header line is "<vocab_size> <vector_size>".
        header = f.readline()
        vocab_size, vector_size = map(int, header.split())
        binary_len = np.dtype('float32').itemsize * vector_size
        # Words absent from GoogleNews keep this random initialization.
        initW = np.random.uniform(-0.25, 0.25, (len(word_to_id), vector_size))
        for line in range(vocab_size):
            # Read the word byte by byte, up to the separating space.
            word = []
            while True:
                ch = f.read(1)
                if ch == b' ':
                    word = b''.join(word).decode('utf-8')
                    break
                if ch != b'\n':
                    word.append(ch)
            if word in word_to_id:
                initW[word_to_id[word]] = np.fromstring(f.read(binary_len), dtype='float32')
            else:
                # Skip the vector of a word outside our vocabulary.
                f.read(binary_len)
    return initW
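
For reference, initW comes back as a plain (len(word_to_id), vector_size) NumPy array, so it can be sanity-checked before touching the graph. A hypothetical call site (assuming loadEmbedding is defined as a plain function, i.e. without self):

initW = loadEmbedding(word_to_id)
print(initW.shape)  # (10000, 300) for the default PTB vocabulary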

From the solution in example 4, I thought I should be able to do something like

session.run(tf.assign(embedding, initW))

If I try to add the line here, like this, when the session is initialized:

with sv.managed_session() as session:
        initializer = tf.random_uniform_initializer(-config.init_scale,
                                                    config.init_scale)
        session.run(tf.assign(m.embedding, initW))

I get the following error:

ValueError: Fetch argument <tf.Tensor 'Assign:0' shape=(10000, 300) dtype=float32_ref> cannot be interpreted as a Tensor. (Tensor Tensor("Assign:0", shape=(10000, 300), dtype=float32_ref, device=/device:CPU:0) is not an element of this graph.)

Update: I updated the code following Nilesh Birari's suggestion: Full code. It results in no improvement in validation or test set perplexity; it only improves training set perplexity.

user3591836
  • What do you mean it doesn't work? What error do you get? – dantiston Mar 24 '17 at 06:22
  • @dantiston I updated the question with added details. – user3591836 Mar 25 '17 at 11:56
  • I haven't figured out the answer to your problem yet, but I think it is better to use `tf.get_variable(..., trainable=False)` instead of what you have. Did you try that? – dantiston Mar 25 '17 at 18:34
  • Also, what do you think `RNN.inputs.assign(initW)` is supposed to do? It looks like you haven't assigned anything to the variable `RNN`. – dantiston Mar 25 '17 at 18:35
  • RNN isn't defined, I just wanted to do the analogous to this line: https://gist.github.com/j314erre/b7c97580a660ead82022625ff7a644d8#file-train-py-L157, but that script defines cnn as the TF graph: https://gist.github.com/j314erre/b7c97580a660ead82022625ff7a644d8#file-train-py-L75. I don't see how to do the analogous here. For your first comment, you're saying replace my entire codeblock with something like inputs = tf.get_variable(..., trainable=False)? (Then I don't understand how it would map the words in the Google embedding to those in the training set.) – user3591836 Mar 26 '17 at 09:53
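
For concreteness, the trainable=False idea from dantiston's comments would look roughly like this (a sketch with assumed names vocab_size and embedding_size, which must match initW's shape):

import tensorflow as tf

# Hypothetical sketch: create the embedding variable inside the model's
# variable scope, seed it from initW, and freeze it during training.
embedding = tf.get_variable(
    "embedding",
    shape=[vocab_size, embedding_size],
    initializer=tf.constant_initializer(initW),
    trainable=False)

Freezing the embedding keeps the optimizer from overwriting the pre-trained vectors, which may be relevant to the update above (training perplexity improving while validation perplexity does not).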

1 Answer


Correct me if I am wrong; I am trying to answer with my limited understanding of TensorFlow.

ValueError: Fetch argument <tf.Tensor 'Assign:0' shape=(10000, 300) dtype=float32_ref> cannot be interpreted as a Tensor. (Tensor Tensor("Assign:0", shape=(10000, 300), dtype=float32_ref, device=/device:CPU:0) is not an element of this graph.)

This simply states that you are trying to initialize an element of a different graph, so I guess you need to be in the same scope in which your graph is defined. Just adjusting your embedding initialization code to be in the same scope should solve the problem.
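
For illustration, the same ValueError can be reproduced with any op that lives in a different tf.Graph than the session's (a minimal sketch, unrelated to embeddings):

import tensorflow as tf

g = tf.Graph()
with g.as_default():
    v = tf.Variable(tf.zeros([3]))
    assign_op = v.assign(tf.ones([3]))  # assign_op belongs to graph g

with tf.Session() as sess:  # this session uses the default graph, not g
    sess.run(assign_op)     # ValueError: ... is not an element of this graph

In the PTB code that means the tf.assign call has to live inside the same with tf.Graph().as_default(): block that the Supervisor's session is built from: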

with tf.Graph().as_default():
    initializer = tf.random_uniform_initializer(-config.init_scale,
                                                config.init_scale)
    with tf.name_scope("Train"):
        train_input = PTBInput(config=config, data=train_data, name="TrainInput")
        with tf.variable_scope("Model", reuse=None, initializer=initializer):
            m = PTBModel(is_training=True, config=config, input_=train_input)
        tf.summary.scalar("Training Loss", m.cost)
        tf.summary.scalar("Learning Rate", m.lr)

    with tf.name_scope("Valid"):
        valid_input = PTBInput(config=config, data=valid_data, name="ValidInput")
        with tf.variable_scope("Model", reuse=True, initializer=initializer):
            mvalid = PTBModel(is_training=False, config=config, input_=valid_input)
        tf.summary.scalar("Validation Loss", mvalid.cost)

    with tf.name_scope("Test"):
        test_input = PTBInput(config=eval_config, data=test_data, name="TestInput")
        with tf.variable_scope("Model", reuse=True, initializer=initializer):
            mtest = PTBModel(is_training=False, config=eval_config,
                             input_=test_input)

    sv = tf.train.Supervisor(logdir=FLAGS.save_path)
    with sv.managed_session() as session:
        word2vec = loadEmbedding(word_to_id)
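        # tf.assign is now created inside the same graph the session runs on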
        session.run(tf.assign(m.embedding, word2vec))
        print("WORKED!!!")

I guess this should be the only problem; as you can see in your first example, the initialization is under the same scope.
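
As an aside (not required for the fix above; same assumed names m, sv, and loadEmbedding as in the question), a common TF1 pattern is to build the assign op once at graph-construction time and feed the NumPy matrix through a placeholder, so the 10000x300 matrix is not baked into the graph as a constant:

import tensorflow as tf

# Inside the same `with tf.Graph().as_default():` block as the model:
embedding_ph = tf.placeholder(tf.float32, shape=m.embedding.get_shape())
embedding_init = m.embedding.assign(embedding_ph)

with sv.managed_session() as session:
    word2vec = loadEmbedding(word_to_id)
    session.run(embedding_init, feed_dict={embedding_ph: word2vec})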

Nilesh Birari