
I'm building a recurrent autoencoder to do feature extraction on some time series sequences. These sequences all have different lengths, and from the tutorials I've looked at it seems that the LSTM input layer only accepts data in a format like (nb_sequence, nb_timestep, nb_feature), with the same number of timesteps for every sequence.

My model looks like this:

from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense
from tensorflow.keras.models import Sequential

# Encoder: compresses each (timesteps, n_features) sequence into a 64-dim vector
encoder = Sequential([
    LSTM(128, activation="relu", input_shape=(timesteps, n_features), return_sequences=True),
    LSTM(64, activation="relu", return_sequences=False),
], name="Encoder")

# Decoder: expands the 64-dim vector back into a (timesteps, n_features) sequence
decoder = Sequential([
    RepeatVector(timesteps, input_shape=(64,)),
    LSTM(64, activation="relu", return_sequences=True),
    LSTM(128, activation="relu", return_sequences=True),
    TimeDistributed(Dense(n_features)),
], name="Decoder")

autoencoder = Sequential([encoder, decoder])
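
For reference, the model as written only accepts a single 3D array with a fixed number of timesteps. A minimal sketch with hypothetical dummy data (assuming timesteps and n_features are already defined as above):

import numpy as np

# Hypothetical: 100 sequences, all with exactly `timesteps` steps
X = np.random.rand(100, timesteps, n_features).astype("float32")

autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=10, batch_size=16)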

Is it possible to train my model on sequences with different lengths? If so, how should I proceed?

  • Take a look at https://stackoverflow.com/questions/51030782/why-do-we-pack-the-sequences-in-pytorch – Lodinn Jan 03 '23 at 16:45
  • At a lower level, inputs have to be collated into a single tensor for training (`collate_fn` in Torch's dataloaders). You can't really train on variable-size inputs, especially not with fully connected layers. The only way around it is batch size = 1 (SGD). – Lodinn Jan 03 '23 at 16:52
  • @Lodinn, so you are telling me the only way is to proceed with one sequence at a time with SGD? How would that affect my model? Are there other ways to do feature extraction with DL models that you could suggest? Thank you in advance – Nathaldien Jan 03 '23 at 21:33
  • No, that's not the *only* way per se - see e.g. https://stackoverflow.com/a/58025761/10672371. Basically, you can pad your sequences to the same length and tell the model to ignore the padding, or even do nothing and train as-is (which is common-ish in image processing). Training with SGD doesn't affect the model itself; the training strategy is not part of the architecture, although it might as well be. With SGD, it might not converge well. With padding, it might learn the padding. – Lodinn Jan 03 '23 at 23:57
  • One more thing you could do is train the encoder and decoder with different optimizers (see e.g. [this](https://www.tensorflow.org/guide/keras/customizing_what_happens_in_fit/#going_lower-level)). Probably [masking](https://www.tensorflow.org/guide/keras/masking_and_padding) is indeed the best solution for your use case (see the sketch after these comments). – Lodinn Jan 04 '23 at 00:00
  • I applied the masking as you suggested and it seems to be working fine, but now I don't know how to translate the output of my autoencoder from the vectorized (embedded) form back into a time series. – Nathaldien Jan 04 '23 at 16:07
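
A minimal sketch of the padding + masking approach suggested in the comments, assuming tf.keras; the values of max_timesteps, n_features, and the dummy data are hypothetical:

import numpy as np
from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense, Masking
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.sequence import pad_sequences

n_features = 3       # hypothetical feature count
max_timesteps = 50   # hypothetical length of the longest sequence

# Hypothetical variable-length sequences, each of shape (length_i, n_features)
sequences = [np.random.rand(np.random.randint(10, max_timesteps + 1), n_features)
             for _ in range(100)]

# Zero-pad every sequence up to max_timesteps -> (nb_sequence, max_timesteps, n_features)
X = pad_sequences(sequences, maxlen=max_timesteps, dtype="float32",
                  padding="post", value=0.0)

encoder = Sequential([
    Masking(mask_value=0.0, input_shape=(max_timesteps, n_features)),  # skip padded steps
    LSTM(128, activation="relu", return_sequences=True),
    LSTM(64, activation="relu", return_sequences=False),
], name="Encoder")

decoder = Sequential([
    RepeatVector(max_timesteps, input_shape=(64,)),
    LSTM(64, activation="relu", return_sequences=True),
    LSTM(128, activation="relu", return_sequences=True),
    TimeDistributed(Dense(n_features)),
], name="Decoder")

autoencoder = Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=10, batch_size=16)

Note that the mask only affects the encoder: the reconstruction loss still covers the padded steps, so you may want sample weights or a custom loss to exclude them, and any genuinely all-zero timestep in the real data would also be masked.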

0 Answers