
I am trying to implement an RNN in TensorFlow for text prediction. I am using BasicLSTMCell for this purpose, with a sequence length of 100.

If I understand correctly, the LSTM's output activation h_t and cell state c_t are reset each time we start a new sequence (that is, they are updated 100 times along a sequence, but once we move to the next sequence in the batch, they are reset to 0).

Is there a way to prevent this from happening in TensorFlow? That is, to keep carrying the updated c_t and h_t across all the sequences in the batch (and then reset them when moving to the next batch)?

Miriam Farber
  • Possible duplicate of [this](http://stackoverflow.com/questions/38241410/tensorflow-remember-lstm-state-for-next-batch-stateful-lstm) – Kh40tiK Nov 27 '16 at 13:12

1 Answer


I don't think you would want to do that, since each example in the batch should be independent. If they are not, you should just use a batch size of 1 and a sequence of length 100 * batch_size. Often you might want to save the state between batches; in that case you would need to save the RNN state to a variable, or, as I prefer to do, let the user feed it in with a placeholder, as sketched below.
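
Here is a minimal sketch of the placeholder approach for TensorFlow 1.x graph mode. The dimensions, the `batches` iterator, and the variable names are hypothetical stand-ins for your own setup; the point is that the initial state comes from placeholders rather than `cell.zero_state()`, so you decide when it resets:

```python
import numpy as np
import tensorflow as tf

batch_size = 32
num_units = 128   # LSTM hidden size (assumed)
seq_len = 100
input_dim = 50    # feature size per timestep (assumed)

inputs = tf.placeholder(tf.float32, [batch_size, seq_len, input_dim])

# Feed the previous run's final state in through placeholders instead of
# letting the graph start from zeros on every session.run call.
c_in = tf.placeholder(tf.float32, [batch_size, num_units])
h_in = tf.placeholder(tf.float32, [batch_size, num_units])
initial_state = tf.nn.rnn_cell.LSTMStateTuple(c_in, h_in)

cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
outputs, final_state = tf.nn.dynamic_rnn(cell, inputs,
                                         initial_state=initial_state)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Start from zeros, then carry the returned state into the next batch.
    c = np.zeros((batch_size, num_units), np.float32)
    h = np.zeros((batch_size, num_units), np.float32)
    for batch in batches:  # `batches` is assumed to be your own data iterator
        out, (c, h) = sess.run([outputs, final_state],
                               feed_dict={inputs: batch, c_in: c, h_in: h})
```

To reset the state at a boundary (e.g. the start of a new document), simply feed zeros again instead of the carried-over `c` and `h`.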

chasep255
  • There's a `stateful_rnn` in Keras and it has a purpose, for example dealing with varying-length sequences. – Kh40tiK Nov 27 '16 at 13:15
  • Yes, but that saves it between batches, like I said; it does not apply the state to the next sequence in the batch. – chasep255 Nov 27 '16 at 13:17
  • I agree that most of the time `stateful_rnn` isn't needed in a batch setting. However, there are times when batches of long and wide sequences don't fit well into memory. – Kh40tiK Nov 27 '16 at 13:21