
Background info:

I need to use TF 1.x to create an LSTM, and save the model.

I am using saver = tf.train.Saver(), which IS working in the sense that it saves four files that allow me to restore my graph and use my model to make predictions.

However, the predictions are as bad as the initial run of the model, instead of being as good as the final model. I don't know why: I used Saver successfully for my last assignment, and I'm using the same lines of code (updated with the correct names), in the same order: define the graph, declare the saver, train the model, then call save_path = saver.save(sess, "my_model") after training but in the same session.
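For context, a minimal sketch of the standard restore pattern (worth noting: one classic cause of exactly this symptom is re-running tf.global_variables_initializer() after saver.restore(), which resets every variable to its initial value):

import tensorflow as tf

with tf.Session() as sess:
    # Rebuild the graph structure from the .meta file...
    saver = tf.train.import_meta_graph('my_model.meta')
    # ...then load the trained variable values from the checkpoint.
    saver.restore(sess, 'my_model')
    # Do NOT run tf.global_variables_initializer() after this point;
    # it would overwrite every restored variable with its initial value,
    # giving predictions as bad as the untrained model.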

Anyway, that's all background. I've given up trying to figure out why Saver isn't working. I did realize that if I manually initialize my weights, for example, using:

W1 = tf.get_variable(initializer=tf.cast(tf.constant(my_init_W1), tf.float32), name='W1')

Then the saved model will use those weights. So my plan is: train the model, manually save the weights, then run the script again, this time initializing from those manually saved weights and using Saver, which will then save the initial model (identical to the final model from the previous training session).
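A minimal sketch of that workaround for an ordinary variable, assuming the trained values are round-tripped through a NumPy file (the W1.npy filename, the shape, and the glorot initializer are my choices for illustration):

import numpy as np
import tensorflow as tf

load_previous = False  # flip to True on the second run
NUM_UNITS = 64         # value assumed for illustration

if load_previous:
    # Second run: bake the previously trained values in as the initializer.
    my_init_W1 = np.load('W1.npy')
    W1 = tf.get_variable(
        initializer=tf.constant(my_init_W1, dtype=tf.float32), name='W1')
else:
    # First run: let TensorFlow initialize W1 normally.
    W1 = tf.get_variable('W1', shape=[NUM_UNITS, 1],
                         initializer=tf.glorot_uniform_initializer())

# ... build the rest of the graph, declare the saver, and train ...

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... training loop ...
    np.save('W1.npy', sess.run(W1))  # dump trained values for the next run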

However, I don't know how to manually initialize an LSTMCell or dynamic_rnn. I've tried everything I can think of, based on the TensorFlow manual, other questions on Stack Overflow, and some blog posts, but nothing has worked.

Here's how I add the cells to my graph:

lstm_cell = tf.nn.rnn_cell.LSTMCell(NUM_UNITS)

h_val, state_val = tf.nn.dynamic_rnn(lstm_cell, rnn_input, dtype=tf.float32)

My question:

How do I manually initialize these layers?

I've been trying to use initial_state, but apparently not correctly.

Currently I've been using:

if not load_previous:
    h_val, state_val = tf.nn.dynamic_rnn(lstm_cell, rnn_input, dtype=tf.float32)

if load_previous:
    h_val, state_val = tf.nn.dynamic_rnn(lstm_cell, rnn_input, dtype=tf.float32, initial_state=sval_f)

where sval_f is the state_val saved from the previous training session. This generates an extremely long error message that I haven't been able to decipher yet.

UPDATE:

OK, I think I've realized that initial_state is actually a tensor of shape [batch_size, cell_size] that represents the state of the hidden cells at the start of the first batch, NOT the initial weights. What I need to find out is how to export the weights from the LSTMCell and dynamic_rnn, and how to use those to manually initialize them in later sessions.
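For reference, that would explain the error: dynamic_rnn expects initial_state to be built from tensors in the current graph. For an LSTMCell it is specifically an LSTMStateTuple of two [batch_size, NUM_UNITS] tensors, the cell state c and the hidden state h. A sketch of the usual pattern, feeding previously saved numpy values at run time rather than passing tensors from an old session:

# Placeholders live in the current graph; saved numpy values of c and h
# are fed in through feed_dict at run time.
c_in = tf.placeholder(tf.float32, [None, NUM_UNITS], name='c_in')
h_in = tf.placeholder(tf.float32, [None, NUM_UNITS], name='h_in')
init_state = tf.nn.rnn_cell.LSTMStateTuple(c_in, h_in)

h_val, state_val = tf.nn.dynamic_rnn(lstm_cell, rnn_input,
                                     initial_state=init_state)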

SECOND UPDATE:

I've learned how to get the weights from the LSTM, using:

my_weights = lstm_cell.get_weights()

I can also set the weights of the LSTM to those values during training using:

lstm_cell.set_weights(my_weights)

However, I don't know how to INITIALIZE the LSTM with those weights, and it seems like Saver isn't going to save my weights unless the LSTM is initialized with them.
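That said, Saver checkpoints whatever values the variables hold at the moment saver.save() is called, not their initializers, so assigning the weights inside the session before saving should be enough. A minimal sketch, assuming the graph (including the dynamic_rnn call, which creates the cell's variables) is already built and my_weights holds the arrays from get_weights():

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # default initialization first
    lstm_cell.set_weights(my_weights)            # then overwrite in-session
    save_path = saver.save(sess, 'my_model')     # checkpoint now holds my_weights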

  • I voted to close my question, because I found a question with an answer that already shows how to initialize the LSTMCell with a user-defined kernel and bias: https://stackoverflow.com/questions/51804671/how-to-set-the-variables-of-lstmcell-as-input-instead-of-letting-it-create-it-in – Joe Apr 25 '20 at 13:28
  • However, this method, in which a subclass MyLSTMCell is created, did **NOT** solve my underlying problem, because TensorFlow's Saver did not save the initialization for the LSTMCell. I think that may be because in MyLSTMCell the underlying LSTMCell is still initialized to zero, and the kernel and bias are then ADDED on top. But I'm not sure about that. – Joe Apr 25 '20 at 13:30
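If the suspicion in that last comment is right, one fix would be a subclass whose build() creates the cell's own kernel and bias variables with the saved arrays as their initializers, so Saver tracks them like any other variable. A hypothetical sketch (my guess at a fix, not code from the linked answer; it assumes a plain LSTMCell with no peepholes or projection, whose stock build() creates exactly a 'kernel' and a 'bias' variable):

import tensorflow as tf

class InitFromWeightsLSTMCell(tf.nn.rnn_cell.LSTMCell):
    """LSTMCell whose kernel and bias are real variables *initialized*
    from saved numpy arrays, so a Saver checkpoints them normally."""

    def __init__(self, num_units, init_kernel, init_bias, **kwargs):
        super(InitFromWeightsLSTMCell, self).__init__(num_units, **kwargs)
        self._init_kernel = init_kernel
        self._init_bias = init_bias

    def build(self, inputs_shape):
        # Create the same 'kernel' and 'bias' variables the stock cell
        # would, but seed them with the saved values instead of the
        # default initializer.
        self._kernel = self.add_variable(
            'kernel', shape=self._init_kernel.shape,
            initializer=tf.constant_initializer(self._init_kernel))
        self._bias = self.add_variable(
            'bias', shape=self._init_bias.shape,
            initializer=tf.constant_initializer(self._init_bias))
        self.built = True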
