
I know there are a lot of questions on this topic, but I don't understand why both options work in my case. The input shape to my LSTM is (10, 24, 2) and my hidden_size is 8.

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, TimeDistributed

hidden_size = 8
model = Sequential()
model.add(LSTM(hidden_size, return_sequences=True, stateful=True,
               batch_input_shape=(10, 24, 2)))  # (batch_size, timesteps, features)
model.add(Dropout(0.1))
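
A quick check of the output shape at this point confirms that the Dropout output (like the LSTM output with return_sequences=True) is three-dimensional:

print(model.output_shape)  # (10, 24, 8): batch_size, timesteps, hidden_size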

Why is it possible to add either this line:

model.add(TimeDistributed(Dense(2))) # Option 1

or this one:

model.add(Dense(2)) # Option 2

Shouldn't Option 2 raise an error, since Dense expects a two-dimensional input but receives the three-dimensional output of the LSTM?

Emma

1 Answer


In your case, the two models you define are identical.

This is because you use return_sequences=True, which means the Dense layer is applied to every timestep, just like TimeDistributed(Dense). If you switch to return_sequences=False, the two models are no longer identical: the plain Dense version still works, but the TimeDistributed(Dense) version raises an error.
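
A minimal sketch of that difference (assuming Keras 2, and using a plain input_shape instead of the stateful/batch_input_shape setup from your question; the build() helper is only for illustration):

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

def build(return_sequences, time_distributed):
    # Toy, non-stateful model with the same (24, 2) timestep shape as in the question.
    model = Sequential()
    model.add(LSTM(8, return_sequences=return_sequences, input_shape=(24, 2)))
    model.add(TimeDistributed(Dense(2)) if time_distributed else Dense(2))
    return model

# return_sequences=True: both variants map (None, 24, 8) to (None, 24, 2)
print(build(True, False).output_shape)   # (None, 24, 2)
print(build(True, True).output_shape)    # (None, 24, 2)

# return_sequences=False: the plain Dense still works on the 2-D output ...
print(build(False, False).output_shape)  # (None, 2)
# ... but TimeDistributed(Dense(2)) raises an error, because TimeDistributed
# needs an input with at least 3 dimensions:
# build(False, True)

So with return_sequences=True, Dense(2) is simply applied to every one of the 24 timesteps, which is exactly what TimeDistributed(Dense(2)) does.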

A more thorough explanation of a similar situation is provided here.

Eypros