3

Given the code below

encoder_inputs = Input(shape=(16, 70))
encoder = LSTM(latent_dim, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
# We discard `encoder_outputs` and only keep the states.
encoder_states = [state_h, state_c]

# Set up the decoder, using `encoder_states` as initial state.
decoder_inputs = Input(shape=(59, 93))
# We set up our decoder to return full output sequences,
# and to return internal states as well. We don't use the
# return states in the training model, but we will use them in inference.
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs,_,_ = decoder_lstm(decoder_inputs,
                                     initial_state=encoder_states)
decoder_dense = TimeDistributed(Dense(93, activation='softmax'))
decoder_outputs = decoder_dense(decoder_outputs)

# Define the model that will turn
# `encoder_input_data` & `decoder_input_data` into `decoder_target_data`
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

if I change

decoder_dense = TimeDistributed(Dense(93, activation='softmax'))

to

decoder_dense = Dense(93, activation='softmax')

it still work, but which method is more effective?

william007
  • 17,375
  • 25
  • 118
  • 194

1 Answers1

7

If your Data is dependent on Time, like Time Series Data or the data comprising different frames of a Video, then Time Distributed Dense Layer is effective than simple Dense Layer.

Time Distributed Dense applies the same dense layer to every time step during GRU/LSTM Cell unrolling. That’s why the error function will be between the predicted label sequence and the actual label sequence.

Using return_sequences=False, the Dense layer will get applied only once in the last cell. This is normally the case when RNNs are used for classification problems.

If return_sequences=True, then the Dense layer is used to apply at every timestep just like TimeDistributedDense.

In your models both are the same, but if u change your second model to return_sequences=False, then the Dense will be applied only at the last cell.

Hope this helps. Happy Learning!