
I'm building a many-to-many network in Keras using an LSTM. I have sequences of varying length (the labels always have the same length as the sequence they describe). To handle the varying lengths, and after searching other SO posts, I've found padding + masking to be the best solution.

This is my model:
[image: Keras model summary]
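
For reference, since the image isn't shown: a minimal sketch of the kind of model this describes, assuming a Masking layer in front of the LSTM and a per-timestep sigmoid output (the LSTM size and mask value are assumptions, not from the original image):

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        # Tell downstream layers to skip padded timesteps (mask value assumed)
        layers.Masking(mask_value=0.0, input_shape=(24, 25)),
        # return_sequences=True keeps one output per timestep (many-to-many)
        layers.LSTM(64, return_sequences=True),
        # One sigmoid unit per timestep, matching binary_crossentropy
        layers.TimeDistributed(layers.Dense(1, activation="sigmoid")),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")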

So I have n (874) samples of max_len (24) padded sequences with 25 features each. But how do I handle my labels? Do I pad them too?

If I pad them in the same way as my X (with the same special value), I get this:
X_train shape : (873, 24, 25)
y_train shape : (873, 24)

All is fine, except I get the following error:

    ValueError: Can not squeeze dim[1], expected a dimension of 1, got 24 for '{{node binary_crossentropy/weighted_loss/Squeeze}} = Squeeze[T=DT_FLOAT, squeeze_dims=[-1]](Cast_1)' with input shapes: [1,24].

Searching for this error leads to posts about removing return_sequences=True from my LSTM layer, but I don't want that, since each of my timesteps is labelled...

And if I don't pad them, they can't be converted to a tensor to be used by TensorFlow.
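
For concreteness, padding the labels the same way as X might look like this (a sketch; y_seqs is a hypothetical name for the list of variable-length label sequences, and the padding value is an assumption):

    from tensorflow.keras.preprocessing.sequence import pad_sequences

    # y_seqs: list of 873 variable-length label sequences (hypothetical name)
    # Pad to the same max_len as X, with an assumed special value
    y_train = pad_sequences(y_seqs, maxlen=24, dtype="float32",
                            padding="post", value=0.0)
    print(y_train.shape)  # (873, 24)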

Edit:
Explanatory illustration of the architecture I want to achieve, courtesy of this answer: https://stackoverflow.com/a/52092176/7732923
[image: many-to-many architecture diagram]


1 Answer


Problem found:

X_train shape : (873, 24, 25)
y_train shape : (873, 24)

y_train contained 873 samples of length 24, with one label for each timestep, as I said. But, probably to allow for multi-label classification, each timestep's label must itself be wrapped in a list, so the right shape for y_train must be:

y_train shape : (873, 24, 1)

So it was just a matter of wrapping each label in [] during preprocessing. The architecture is sound and works (and now I'm left to determine how well, but that's another beast ahah)
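
A minimal sketch of that fix, assuming y_train is a NumPy array: np.expand_dims adds the trailing axis without touching the labels themselves.

    import numpy as np

    # (873, 24) -> (873, 24, 1): each timestep's label becomes a length-1 vector
    y_train = np.expand_dims(y_train, axis=-1)
    print(y_train.shape)  # (873, 24, 1)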
