I'm trying to implement an LSTM neural network in TensorFlow to do keyword detection. I feed the neural network sequences of 400 ms. However, during training, I don't want the LSTM to remember sequence 1 while trying to learn sequence 6, for instance. So how can I reset the state of the LSTM during training? Does the `initial_state` argument in `outputs, state = rnn.rnn(cell, inputs, initial_state=self._initial_state)` allow me to reset the memory of the LSTM once the entire batch is fed?

I tried to understand the implementation with this link:

https://github.com/tensorflow/models/blob/master/tutorials/rnn/ptb/ptb_word_lm.py
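
For reference, my setup looks roughly like this (TensorFlow 0.x-era rnn/rnn_cell API; the sizes are illustrative, each frame being a 39-dimensional MFCC vector):

import tensorflow as tf
from tensorflow.python.ops import rnn, rnn_cell

batch_size, n_timesteps, n_input, n_hidden, num_layers = 32, 40, 39, 128, 2  # illustrative

x = tf.placeholder(tf.float32, [batch_size, n_timesteps, n_input])
# rnn.rnn expects a list of n_timesteps tensors of shape [batch_size, n_input]
tmp = tf.unpack(tf.transpose(x, [1, 0, 2]))

lstm_cell = rnn_cell.BasicLSTMCell(n_hidden)
multi_lstm = rnn_cell.MultiRNNCell([lstm_cell] * num_layers)
init_state = multi_lstm.zero_state(batch_size, tf.float32)
outputs, states = rnn.rnn(multi_lstm, tmp, initial_state=init_state)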

Chris
  • I am not sure I understand your goal, but you can reset the state when you run the network. Say `network = tf.nn.rnn_cell.MultiRNNCell(...)`; then running `network(input, current_state)` produces an output and a new state. You can ignore this new state and input `current_state` again, for example. – Eric Platon Aug 02 '16 at 03:47
  • Yes, sorry, maybe I wasn't clear enough. When I train the LSTM network, I feed it a list (of [batch_size, 39] MFCC tensors) of length n_timesteps. I would like to know if the state will be reset every n_timesteps if I implement: `multi_lstm = rnn_cell.MultiRNNCell([lstm_cell] * config.num_layers); init_state = multi_lstm.zero_state(batch_size, tf.float32); outputs, states = rnn.rnn(multi_lstm, tmp, dtype=tf.float32, initial_state=init_state)` – Chris Aug 02 '16 at 07:53
  • There is no reason for the state to be reset, unless you do it explicitly. I do not see `n_timesteps` in your code; are you referring to truncated back-propagation? – Eric Platon Aug 02 '16 at 10:58
  • My code looks like this one: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/recurrent_network.py. Yes, when the network is trained on sequence 6, I don't want it to take into account the result on sequence 1, for instance. That's why I would like to reset the states every n_timesteps, which is the length of the sequence. – Chris Aug 02 '16 at 12:18

1 Answer

In `ptb_word_lm.py`, `self._initial_state` is set only once in the whole program:

self._initial_state = cell.zero_state(batch_size, data_type())

This means it remains a constant zero vector, so the initial state for each unrolling of the LSTM is always zero. You don't need to explicitly reset the memory after a batch is fed.
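
To make this concrete, here is a minimal sketch using the same 0.x-era rnn/rnn_cell API as in the question (names and sizes are illustrative):

import tensorflow as tf
from tensorflow.python.ops import rnn, rnn_cell

batch_size, n_steps, n_input, n_hidden = 32, 40, 39, 128  # illustrative

x = tf.placeholder(tf.float32, [batch_size, n_steps, n_input])
inputs = tf.unpack(tf.transpose(x, [1, 0, 2]))

cell = rnn_cell.BasicLSTMCell(n_hidden)
# zero_state is an ordinary tensor of zeros, not a Variable, so every
# session.run starts the unrolling from a zero state: nothing carries
# over between batches unless you feed final_state back in yourself.
init_state = cell.zero_state(batch_size, tf.float32)
outputs, final_state = rnn.rnn(cell, inputs, initial_state=init_state)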

If you would like to manually update the LSTM's state (`self._initial_state`), you need to define it as a `Variable` instead of a `Tensor`. See the answers here for more info.
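
As an illustration of that pattern (continuing the sketch above, and assuming `state_is_tuple=False`, the old default, so the whole LSTM state is one [batch_size, 2 * n_hidden] tensor):

# Non-trainable Variable that holds the LSTM state across session.run calls
state_var = tf.Variable(tf.zeros([batch_size, 2 * n_hidden]), trainable=False)

# Separate variable scope so this can coexist with the sketch above
outputs, new_state = rnn.rnn(cell, inputs, initial_state=state_var, scope="stateful_rnn")

# Carry the state over to the next batch:
keep_state = tf.assign(state_var, new_state)
# Explicitly clear the memory, e.g. at every sequence boundary:
reset_state = tf.assign(state_var, tf.zeros_like(state_var))

After each batch you would run `keep_state` to propagate the memory forward, and run `reset_state` whenever a new sequence starts.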

Kilian Obermeier