I'm trying to implement an LSTM neural network in TensorFlow to do keyword detection. I feed the neural network sequences of 400 ms. However, during training, I don't want the LSTM to remember sequence 1 while trying to learn sequence 6, for instance. So how can I reset the state of the LSTM during training? Does the `initial_state` argument in `outputs, state = rnn.rnn(cell, inputs, initial_state=self._initial_state)` allow me to reset the memory of the LSTM once the entire batch is fed?

I tried to understand the implementation with this link:

https://github.com/tensorflow/models/blob/master/tutorials/rnn/ptb/ptb_word_lm.py
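
For reference, my setup looks roughly like this (TensorFlow 0.x-era rnn/rnn_cell API; the sizes are illustrative, each frame being a 39-dimensional MFCC vector):

import tensorflow as tf
from tensorflow.python.ops import rnn, rnn_cell

batch_size, n_timesteps, n_input, n_hidden, num_layers = 32, 40, 39, 128, 2  # illustrative

x = tf.placeholder(tf.float32, [batch_size, n_timesteps, n_input])
# rnn.rnn expects a list of n_timesteps tensors of shape [batch_size, n_input]
tmp = tf.unpack(tf.transpose(x, [1, 0, 2]))

lstm_cell = rnn_cell.BasicLSTMCell(n_hidden)
multi_lstm = rnn_cell.MultiRNNCell([lstm_cell] * num_layers)
init_state = multi_lstm.zero_state(batch_size, tf.float32)
outputs, states = rnn.rnn(multi_lstm, tmp, initial_state=init_state)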

Chris
  • I am not sure I understand your goal, but you can reset the state when you run the network. Say `network = tf.nn.rnn_cell.MultiRNNCell(...)`; then running `network(input, current_state)` produces an output and a new state. You can ignore this new state and input `current_state` again, for example. – Eric Platon Aug 02 '16 at 03:47
  • Yes, sorry, maybe I wasn't clear enough. When I train the LSTM network, I feed it a list (of [batch_size, 39] MFCC tensors) of length n_timesteps. I would like to know if the state will be reset every n_timesteps if I implement: `multi_lstm = rnn_cell.MultiRNNCell([lstm_cell] * config.num_layers); init_state = multi_lstm.zero_state(batch_size, tf.float32); outputs, states = rnn.rnn(multi_lstm, tmp, dtype=tf.float32, initial_state=init_state)` – Chris Aug 02 '16 at 07:53
  • There is no reason for the state to be reset, unless you do it explicitly. I do not see `n_timesteps` in your code; are you referring to truncated back-propagation? – Eric Platon Aug 02 '16 at 10:58
  • My code looks like this one: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/recurrent_network.py. Yes, when the network is trained on sequence 6, I don't want it to take into account the result on sequence 1, for instance. That's why I would like to reset the states every n_timesteps, which is the length of the sequence. – Chris Aug 02 '16 at 12:18

1 Answer

In `ptb_word_lm.py`, `self._initial_state` is set only once in the whole program:

self._initial_state = cell.zero_state(batch_size, data_type())

This means it remains a constant zero vector, so the initial state for each unrolling of the LSTM is always zero. You don't need to explicitly reset the memory after a batch is fed.
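
To make this concrete, here is a minimal sketch using the same 0.x-era rnn/rnn_cell API as in the question (names and sizes are illustrative):

import tensorflow as tf
from tensorflow.python.ops import rnn, rnn_cell

batch_size, n_steps, n_input, n_hidden = 32, 40, 39, 128  # illustrative

x = tf.placeholder(tf.float32, [batch_size, n_steps, n_input])
inputs = tf.unpack(tf.transpose(x, [1, 0, 2]))

cell = rnn_cell.BasicLSTMCell(n_hidden)
# zero_state is an ordinary tensor of zeros, not a Variable, so every
# session.run starts the unrolling from a zero state: nothing carries
# over between batches unless you feed final_state back in yourself.
init_state = cell.zero_state(batch_size, tf.float32)
outputs, final_state = rnn.rnn(cell, inputs, initial_state=init_state)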

If you would like to manually update the LSTM's state (`self._initial_state`), you need to define it as a `Variable` instead of a `Tensor`. See the answers here for more info.
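
As an illustration of that pattern (continuing the sketch above, and assuming `state_is_tuple=False`, the old default, so the whole LSTM state is one [batch_size, 2 * n_hidden] tensor):

# Non-trainable Variable that holds the LSTM state across session.run calls
state_var = tf.Variable(tf.zeros([batch_size, 2 * n_hidden]), trainable=False)

# Separate variable scope so this can coexist with the sketch above
outputs, new_state = rnn.rnn(cell, inputs, initial_state=state_var, scope="stateful_rnn")

# Carry the state over to the next batch:
keep_state = tf.assign(state_var, new_state)
# Explicitly clear the memory, e.g. at every sequence boundary:
reset_state = tf.assign(state_var, tf.zeros_like(state_var))

After each batch you would run `keep_state` to propagate the memory forward, and run `reset_state` whenever a new sequence starts.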

Kilian Obermeier