
I am trying to understand the concept of LSTM layers in Keras. I just want to confirm some behavior of LSTM and check whether I understand it correctly.

Assuming that I have 1000 samples, each with 1 time step, and a batch size of 1 with

stateful = True

is this the same as 1 sample with 1000 time steps and a batch size of 1 with

stateful = False

Here I am also assuming that in both cases I have the same information, just in different shapes, and that I reset the state of my LSTM layer after every training epoch.

I also think that the batch size in the stateless case only matters for my training sequence, because if I set

stateful = False 

I can use input_shape instead of batch_input_shape. So my LSTM layer does not need a batch dimension, only time step and feature dimensions. Is this correct?
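To show what I mean, here is a minimal sketch of the two cases (assuming tensorflow.keras; the 8 units are just a made-up layer size):

```python
from tensorflow import keras

# Stateless: the batch dimension can stay unspecified (None),
# only (timesteps, features) is needed.
stateless = keras.Sequential([
    keras.Input(shape=(1000, 1)),
    keras.layers.LSTM(8),
])

# Stateful: Keras needs the fixed batch size up front, so the state
# tensors can be allocated once and kept between batches:
# (batch, timesteps, features).
stateful = keras.Sequential([
    keras.Input(batch_shape=(1, 1, 1)),
    keras.layers.LSTM(8, stateful=True),
])
```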

I drew these conclusions from:

https://github.com/keras-team/keras/blob/master/keras/layers/recurrent.py#L1847

When does keras reset an LSTM state?

Understanding Keras LSTMs

And if I have a multi-layer LSTM net and the first LSTM layer is stateful, all the other layers should also be stateful, right?
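For example, I imagine a stacked stateful net would look like this (again assuming tensorflow.keras and made-up layer sizes):

```python
from tensorflow import keras

# A stacked stateful LSTM: every recurrent layer keeps its own state,
# so each one is marked stateful=True. All layers except the last must
# also return the full sequence (return_sequences=True) so the next
# layer receives one input per time step.
model = keras.Sequential([
    keras.Input(batch_shape=(1, 1, 1)),  # (batch, timesteps, features)
    keras.layers.LSTM(8, stateful=True, return_sequences=True),
    keras.layers.LSTM(8, stateful=True),
])
```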

I hope somebody understands what I mean and can help me. If my questions are unclear, please tell me and I will update this post.

Thanks, everybody.

D.Luipers

1 Answer


stateful=True means that you keep the final state for every batch and pass it as initial state for the next batch. So yes, in this case it's the same if you have 1 batch of 1000 samples or 1000 batches of 1 sample.
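To make that concrete, here is a minimal pure-Python sketch (a toy recurrent cell, not Keras) showing that carrying the final state of each batch into the next batch is the same computation as processing one long sequence:

```python
import math

def rnn_step(x, h, w=0.5, u=0.3):
    """One step of a toy recurrent cell: h' = tanh(w*x + u*h)."""
    return math.tanh(w * x + u * h)

def process_batch(batch, h):
    """Run the cell over a batch of time steps, starting from state h."""
    for x in batch:
        h = rnn_step(x, h)
    return h

xs = [0.01 * i for i in range(1000)]  # one signal, 1000 time steps

# Case A: 1 batch containing the whole 1000-step sequence.
hA = process_batch(xs, 0.0)

# Case B: 1000 batches of 1 time step each; the final state of one
# batch is passed as the initial state of the next (stateful=True).
hB = 0.0
for x in xs:
    hB = process_batch([x], hB)

assert hA == hB  # identical final state
```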

  • Ok, I see the difference, but in both cases I would have 1 sequence with 1000 time steps, wouldn't I? Because in the stateless case my only sample or sequence has 1000 time steps, and in the stateful case the LSTM would see the 1 time step in each of my 1000 sequences as 1 sequence, right? – D.Luipers Oct 26 '18 at 14:51
  • Stateless LSTM does not exist. If you don't have states then it's just a common neural network. And for the stateful case, it's different, because with 1 sequence of 1000 time steps you process 1000 different cells with different parameters, and with 1000 sequences you go through 1 cell only. Imagine your LSTM network like a tunnel. In the first case the tunnel is very long and you pass through it only 1 time. In the other case, the tunnel is very short and you pass through it 1000 times. – Mael Galliffet Oct 26 '18 at 15:01
  • By the stateless case I understood return_state=False, which is the default setting in Keras. I forgot to mention that I mean a many-to-one case, but I think I got the right idea of how it works; I just can't really put it into words. So in the first case, when return_state=True, it is like I connect the short tunnels into one long tunnel, right? So the states will be passed to every batch. – D.Luipers Oct 26 '18 at 15:10
  • Just checked the documentation and it seems you are right: `stateful=True` means that you keep the final state for every batch and pass it as initial state for the next batch. So yes, in this case it's probably the same if you have 1 batch of 1000 samples or 1000 batches of 1 sample. – Mael Galliffet Oct 29 '18 at 08:09
  • Ok, if you could update your answer, then I can mark it as correct :) – D.Luipers Oct 30 '18 at 09:53