
My goals are: (1) load a pre-trained word embedding matrix from a file as the initial value; (2) fine-tune the word embedding during training instead of keeping it fixed; (3) each time I restore the model, load the fine-tuned word embedding instead of the pre-trained one.

I have tried something like:

import tensorflow as tf

class model():
    def __init__(self):
        # ...
        pass

    def _add_word_embed(self):
        W = tf.get_variable('W', [self._vsize, self._emb_size],
                            initializer=tf.truncated_normal_initializer(stddev=1e-4))
        # This only creates an assign op; and if it runs at startup it
        # reloads the pre-trained matrix, overwriting any fine-tuning.
        W.assign(load_and_read_w2v())
        # ...

    def _add_seq2seq(self):
        # ...
        pass

    def build_graph(self):
        self._add_word_embed()
        self._add_seq2seq()

But this approach overwrites the fine-tuned word embedding whenever I stop training and restart it. I also tried sess.run(W.assign(...)) after calling model.build_graph(), but it threw an error saying the graph had been finalized and could no longer be modified. Could you please tell me the right way to achieve this? Thanks in advance!
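For reference, the underlying issue is that W.assign(...) only creates an op in the graph; to be runnable after the graph is finalized, that op must be built ahead of time and a handle kept to it. A minimal sketch of that pattern (reusing the load_and_read_w2v helper from the question; model_instance is a hypothetical name for the built model):

class model():
    # ...
    def _add_word_embed(self):
        self.W = tf.get_variable('W', [self._vsize, self._emb_size],
                                 initializer=tf.truncated_normal_initializer(stddev=1e-4))
        # Build the assign op while the graph is still mutable and keep
        # a handle to it; run it once after the session is created.
        self.embed_init = self.W.assign(load_and_read_w2v())

# later, after the session exists and before training starts
# (only on a fresh run, not after restoring a checkpoint):
# sess.run(model_instance.embed_init)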

EDIT:

This question is not a duplicate: it adds a new requirement, namely using the pre-trained word embedding at the beginning of training and fine-tuning it afterwards, and it also asks how to do this efficiently. The accepted answer on the linked question does not satisfy this requirement. Please consider this before marking the question as a duplicate.

user5779223

1 Answer


Here is a toy example of how to do it:

import numpy as np
import tensorflow as tf

# The graph

# Inputs
vocab_size = 2
embed_dim = 2
embedding_matrix = np.ones((vocab_size, embed_dim), dtype=np.float32)

# The weight matrix to initialize with the pre-trained embeddings
W = tf.get_variable(initializer=tf.zeros([vocab_size, embed_dim]), name='embed', trainable=True)

# Global step, used to make sure the weights are loaded from the numpy
# array only on the very first run, not when training is resumed.
global_step = tf.Variable(0, dtype=tf.int32, trainable=False, name='global_step')

# Initialization of the weights based on global_step:
# assign the pre-trained matrix only when global_step == 0.
initW = tf.cond(tf.equal(global_step, 0), lambda: W.assign(embedding_matrix), lambda: W)

# Stand-in for a training update that fine-tunes the embeddings
inc = tf.assign_add(W, [[1, 1], [1, 1]])

# Update global step
update = tf.assign_add(global_step, 1)
op = tf.group(inc, update)

# init_fn run by the Supervisor when the session is created
def init_embed(sess):
    sess.run(initW)
Now if we run the above in a session:

sv = tf.train.Supervisor(logdir='tmp', init_fn=init_embed)
with sv.managed_session() as sess:
    print('global step:', sess.run(global_step))
    print('Initial weight:')
    print(sess.run(W))
    for i in range(2):
        sess.run(op)
    _W, g_step = sess.run([W, global_step])
    print('Final weight:')
    print(_W)
    sv.saver.save(sess, sv.save_path, global_step=g_step)

# Output at first run
Initial weight:
[[ 1.  1.]
 [ 1.  1.]]
Final weight:
[[ 3.  3.]
 [ 3.  3.]]

# Output at second run
Initial weight:
[[ 3.  3.]
 [ 3.  3.]]
Final weight:
[[ 5.  5.]
 [ 5.  5.]]
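A note on the efficiency part of the question (a sketch of my own, not part of the answer above): W.assign(embedding_matrix) stores the whole matrix as a constant inside the graph definition, which gets large for a real vocabulary. A common variant feeds the matrix through a placeholder instead, so it is only passed once via feed_dict at initialization time:

import numpy as np
import tensorflow as tf

vocab_size, embed_dim = 2, 2

W = tf.get_variable('embed', shape=[vocab_size, embed_dim], trainable=True)
global_step = tf.Variable(0, dtype=tf.int32, trainable=False, name='global_step')

# Feed the matrix through a placeholder so it is not baked into the
# GraphDef as a constant.
embed_ph = tf.placeholder(tf.float32, [vocab_size, embed_dim])
embed_init = W.assign(embed_ph)

def init_embed(sess):
    # Load the pre-trained matrix only on a fresh run; on a restart the
    # fine-tuned W is restored from the checkpoint instead.
    if sess.run(global_step) == 0:
        pretrained = np.ones((vocab_size, embed_dim), dtype=np.float32)
        sess.run(embed_init, feed_dict={embed_ph: pretrained})

The checkpoint still stores the fine-tuned W, so restarts resume from the fine-tuned values exactly as in the toy example above.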
Vijay Mariappan