
related to this: How can I copy a variable in tensorflow

I am trying to copy the weights of an LSTM decoder unit so I can reuse it elsewhere for beam search. In pseudocode, I would like something like this:

lstm_decode = tf.nn.rnn_cell(...)
training_output = tf.nn.seq2seq.rnn_decoder(...)
... do training by back-propagating the error on training_output ...

# duplicate the lstm_decode unit (same weights)
lstm_decode_copy = copy(lstm_decode)
... do beam search with the duplicated lstm ...

The issue is that in TensorFlow, the LSTM variables are not created during the call to "tf.nn.rnn_cell(...)"; they are actually created when the cell is unrolled inside the call to rnn_decoder.

I could set the scope on the "tf.nn.seq2seq.rnn_decoder" function call, but the actual initialization of the LSTM weights is not transparent to me. How can I capture these values and reuse them to build an LSTM cell with the same weights as the learned ones?

Thanks!

Evan Pu

1 Answer


I think I figured it out.

What you want is to set the scope of the decoder call to a particular value, say "decoding", in this line:

training_output = tf.nn.seq2seq.rnn_decoder(...scope="decoding")

Later, when you want to use the LSTM units learned during decoding, enter the variable scope "decoding" again and call scope.reuse_variables() to allow the decoding variables to be reused. Then simply use lstm_decode as you would otherwise; it will be bound to the same weights as before.

with tf.variable_scope("decoding") as scope:
  scope.reuse_variables()
  ... use lstm_decode as usual ...

This way all the weights in lstm_decode are shared between the two sub-graphs, and whatever values were learned during training will also be present in the second part.
