
I have a CNN-LSTM model in which the CNN takes input of shape (None, 301, 4, 1) and outputs data of shape (None, 606). To adapt the CNN output to the LSTM input, I added a TimeDistributed layer that applies the CNN model over a window of size 100, so the input shape of this layer is (None, 100, 301, 4, 1). It is followed by some stacked LSTM layers.

This is the architecture of the CNN model:

[Image: CNN model architecture]

This is the architecture of the LSTM model:

[Image: LSTM model architecture]

The code for this architecture is the following:

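# Imports assumed to come from tensorflow.keras (not shown in the original post)
from tensorflow.keras.layers import Input, Concatenate, TimeDistributed, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import RMSprop

# CNN: one (301, 4, 1) frame in, a 606-dimensional feature vector out
input_layer1 = Input(shape=(301, 4, 1))
...
merge_layer = Concatenate(axis=1)([global_max_pooling, lambda_14])
cnn_model = Model(inputs=input_layer1, outputs=merge_layer)
cnn_model.compile(optimizer=RMSprop(), loss="mean_squared_error", metrics=['mse', 'mae'])

# CNN-LSTM: the CNN is applied to each of the 100 timesteps of a window
input_lstm = Input(shape=(100, 301, 4, 1))
cnn_output = TimeDistributed(cnn_model)(input_lstm)
...
output_layer = Dense(1, activation="linear")(lstm3)
cnn_lstm_model = Model(inputs=input_lstm, outputs=output_layer)
cnn_lstm_model.compile(optimizer=RMSprop(), loss="mean_squared_error", metrics=['mse', 'mae'])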

I then saved only the cnn_lstm_model model.

For the training, this is my code:

from tensorflow.keras.callbacks import TensorBoard  # assumed import, not shown in the original post

batchsize = 100
epoch = 20
cnn_lstm_model.fit(train_data_force_temp_X, data_Y,
                   batch_size=batchsize,
                   epochs=epoch,
                   verbose=1,
                   shuffle=True,
                   validation_data=(test_data_force_temp_X, test_Y),
                   callbacks=[TensorBoard(log_dir="./CNN_LSTM")])

Here train_data_force_temp_X.shape = (1960, 301, 4, 1), where 1960 is the number of samples.

But I get this error:

ValueError: Input 0 is incompatible with layer model_1: expected shape=(None, 100, 301, 4, 1), found shape=(None, 301, 4, 1)

I understand that I passed the wrong shape to cnn_lstm_model, but I thought it would first pass the data to the CNN model, which has input shape (None, 301, 4, 1), and then, for every 100 CNN outputs, call the TimeDistributed layer and continue the process. It seems I have not understood the process correctly.
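For reference, here is a minimal NumPy sketch of what a TimeDistributed wrapper conceptually does with the time axis. It is not the Keras implementation: `fake_cnn` and `time_distributed` are hypothetical stand-ins introduced only for illustration, and `fake_cnn` merely mimics the (n, 301, 4, 1) -> (n, 606) shape contract of cnn_model.

```python
import numpy as np

def fake_cnn(batch):
    """Hypothetical stand-in for cnn_model: (n, 301, 4, 1) -> (n, 606)."""
    n = batch.shape[0]
    flat = batch.reshape(n, -1)      # (n, 1204)
    return flat[:, :606]             # keep 606 values per frame

def time_distributed(fn, x):
    """Apply fn to every timestep of x, where x has shape (batch, time, ...)."""
    b, t = x.shape[:2]
    merged = x.reshape((b * t,) + x.shape[2:])   # fold the time axis into the batch axis
    out = fn(merged)                             # (b * t, features)
    return out.reshape((b, t) + out.shape[1:])   # unfold back to (b, t, features)

# The time axis (100) must already be present in the input
x = np.zeros((2, 100, 301, 4, 1), dtype="float32")
print(time_distributed(fake_cnn, x).shape)
```

The wrapper only iterates over a time axis that is already in the data; it does not group 100 separate samples into a window.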

So my questions are:

  • Do I have to run the data through the CNN model first, get its predictions, and then use those outputs as input for the cnn_lstm model?

  • How can I fix the training process?

Thank you in advance for the help.

el abed houssem
  • The input samples of cnn_lstm_model must have 100 timesteps; otherwise there is no reason to apply CNN + TimeDistributed – Marco Cerliani Jul 05 '21 at 14:19
  • @MarcoCerliani Thank you for the response. I know that cnn_lstm_model must have 100 timesteps, but what I understood about the TimeDistributed layer is that it calls the CNN model 100 times and thereby constructs the 100 timesteps. I think I am wrong, though, so do you have an idea how to fix this? – el abed houssem Jul 05 '21 at 20:24
  • Yes, TimeDistributed calls the CNN model 100 times along the temporal dimension, but it doesn't construct anything. You have to rearrange your data so that it has a temporal dimension; otherwise, remove the TimeDistributed layer – Marco Cerliani Jul 05 '21 at 22:15
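One way to rearrange the data as suggested in the comments is to group consecutive frames into sliding windows of 100 before calling fit. The sketch below is only an illustration under assumptions the question does not state: the 1960 samples are taken to be in temporal order, and each window is labelled with the target of its last frame. `make_windows` and the demo arrays are hypothetical names, not part of the original code.

```python
import numpy as np

def make_windows(frames, targets, window=100, stride=1):
    """Group consecutive frames into (possibly overlapping) windows.

    frames:  array of shape (n_samples, 301, 4, 1)
    targets: array whose first axis has length n_samples
    Returns (X, y) with X.shape == (n_windows, window, 301, 4, 1).
    Assumption: each window takes the target of its last frame.
    """
    n_samples = frames.shape[0]
    if n_samples < window:
        raise ValueError("need at least `window` frames")
    starts = np.arange(0, n_samples - window + 1, stride)
    # idx[i, t] is the index of the t-th frame of window i
    idx = starts[:, None] + np.arange(window)[None, :]
    X = frames[idx]              # fancy indexing copies the data
    y = targets[idx[:, -1]]      # target of the last frame in each window
    return X, y

# Small demo with random data standing in for the real arrays
demo_frames = np.random.rand(120, 301, 4, 1).astype("float32")
demo_targets = np.random.rand(120).astype("float32")
X_win, y_win = make_windows(demo_frames, demo_targets, window=100)
print(X_win.shape, y_win.shape)
```

With the 1960 frames from the question and stride=1, this would give 1960 - 100 + 1 = 1861 windows, i.e. an input of shape (1861, 100, 301, 4, 1) that matches what cnn_lstm_model expects. Because the windows are materialized as copies, a larger stride may be needed to keep memory use reasonable. There should be no need to run the CNN separately first, since TimeDistributed(cnn_model) already applies it to every timestep inside cnn_lstm_model.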
