
The main purpose of an LSTM is to exploit its memory property. Given that, what is the point of a stateless LSTM existing? Don't we effectively "convert" it into a plain NN by using it that way?

In other words: does the stateless use of an LSTM aim to model only the sequences (windows) in the input data, assuming we set shuffle=False in fit() in Keras (e.g., for a window of 10 time steps, capture any pattern within 10-character words)? If so, why don't we convert the initial input data to match the form of the sequences under inspection and then use a plain NN?

If we choose shuffle=True, then we lose any ordering information that could be found in our data (e.g. time-series sequences), don't we? In that case I would expect it to behave similarly to a plain NN and to get the same results from the two models by setting the same random seed.

Am I missing something in my thinking?

Thanks!

mrt

1 Answer


Data for Keras LSTM models always has the shape (batch_size, n_steps, n_features). When you use shuffle=True, the shuffling happens along the batch_size axis only, so the natural order within each sequence of length n_steps is retained.

In cases where each of your batch_size sequences is unrelated to the others, it is natural to use a stateless model. Each array still contains internal order (it is a time series of its own), but it does not depend on the other sequences in your batch.
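A minimal NumPy sketch of the point above (toy data, not Keras internals): shuffling permutes only the first (sample) axis, so every individual sequence keeps its internal time order.

```python
import numpy as np

# Hypothetical toy data: 4 independent sequences, each 3 time steps long
# with 1 feature -- the (batch_size, n_steps, n_features) layout that
# Keras LSTMs expect.
x = np.arange(12, dtype=float).reshape(4, 3, 1)

# shuffle=True in model.fit permutes only the sample axis; we can mimic
# that with a random permutation of the sample indices.
rng = np.random.default_rng(0)
perm = rng.permutation(len(x))
x_shuffled = x[perm]

# The samples appear in a new order, but each individual sequence
# (x_shuffled[i]) is still in its original time order.
for seq in x_shuffled:
    print(seq.ravel())
```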

Boudewijn Aasman
  • Let's say my dataset has 100 samples and the batch size is 10. When I set shuffle = True, Keras first reorders the samples randomly (so the 100 samples are in a new order) and then creates the batches from that new order: batch 1: 1-10, batch 2: 11-20, etc. Isn't that how shuffling works? – mrt Aug 07 '17 at 20:58
  • Yes, that is correct. My point is that the actual sequences of n_steps are independent time series in themselves and never get shuffled internally. – Boudewijn Aasman Aug 08 '17 at 00:15
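The shuffle-then-batch order described in the comments can be sketched as follows (a simplified model of the mechanics, not Keras's actual implementation): the sample indices are permuted once, then consecutive slices of that permutation form the batches.

```python
import numpy as np

# 100 samples, batch size 10, as in the comment above.
n_samples, batch_size = 100, 10

# Shuffle the sample indices once (what shuffle=True does per epoch)...
rng = np.random.default_rng(42)
order = rng.permutation(n_samples)

# ...then slice consecutive batches from the shuffled order.
batches = [order[i:i + batch_size] for i in range(0, n_samples, batch_size)]

print(len(batches))     # number of batches
print(len(batches[0]))  # samples per batch
```

Every sample appears in exactly one batch; only the assignment of samples to batches is randomized.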