I am trying to build a 3-sequence many-to-many LSTM model, but I am confused about its implementation in Keras. I searched the internet for examples of many-to-many models, but each website gives a different method, which has confused me even more. Which of these methods is correct? I want a model like this: [image]

Some of the methods I found were:

  1. Using encoder, decoder
from keras.models import Sequential
from keras.layers import LSTM, Dense, RepeatVector, TimeDistributed

model = Sequential()

# encoder layer
model.add(LSTM(100, activation='relu', input_shape=(3, 1)))

# repeat vector
model.add(RepeatVector(3))

# decoder layer
model.add(LSTM(100, activation='relu', return_sequences=True))

model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
  2. Another with encoder, decoder
from keras.models import Model
from keras.layers import Input, LSTM, Dense

encoder_inputs = Input(shape=(None, 1))
encoder = LSTM(100, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs)
encoder_states = [state_h, state_c]


decoder_inputs = Input(shape=(None, 1))

decoder_lstm = LSTM(100, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                     initial_state=encoder_states)
# num_decoder_tokens (the number of output classes) must be defined before this line
decoder_dense = Dense(num_decoder_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)


model = Model([encoder_inputs, decoder_inputs], decoder_outputs)

  3. LSTM with return_sequences=True followed by TimeDistributed(Dense)
model = Sequential()
model.add(LSTM(100,input_shape=(3,1),return_sequences=True))
model.add(TimeDistributed(Dense(2)))
model.compile(optimizer='adam', loss='mse')
  4. LSTM with return_sequences=True only
model = Sequential()
model.add(LSTM(100,input_shape=(3,1),return_sequences=True))
model.compile(optimizer='adam', loss='mse')

Which one of these is the correct method? Which one will give me a model like the one I want?

Shantanu Shinde

1 Answer

The right choice depends on your problem, so you need to state your problem first.

Methods 1 and 2 are best suited to neural machine translation problems. Of the two, 2 is superior because it uses the LSTM's returned states: the encoder's final states (return_state=True) are passed to the decoder as its initial state. Method 3 is also a good architecture when the logic from input to output is simple. Method 4 is a very basic architecture, because the nth output only has knowledge of the first n inputs and not of later ones. It also has no fully connected (Dense) layer, so it is unlikely to learn even moderately complex logic.
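To illustrate the point about the nth output only seeing the first n inputs, here is a minimal sketch in plain Python. It does not use Keras: `run_rnn` is a hypothetical toy recurrent layer with arbitrary fixed weights (`w_in`, `w_rec`), standing in for a unidirectional LSTM with return_sequences=True. The same left-to-right dependency should hold for a real LSTM layer, but treat this only as an illustration of the idea.

```python
import math

def run_rnn(xs, w_in=0.5, w_rec=0.8):
    """Toy unidirectional recurrent layer returning one output per timestep.

    Hypothetical stand-in for LSTM(..., return_sequences=True); the weights
    w_in and w_rec are arbitrary illustrative values, not learned.
    """
    h = 0.0
    outputs = []
    for x in xs:
        # The new state depends only on the current input and the previous state.
        h = math.tanh(w_in * x + w_rec * h)
        outputs.append(h)
    return outputs

original = run_rnn([1.0, 2.0, 3.0])
changed_last = run_rnn([1.0, 2.0, -5.0])  # only the third input differs

# The first two outputs are unaffected by the change to the third input.
print(original[:2] == changed_last[:2])  # True
# Only the third output reacts to it.
print(original[2] != changed_last[2])    # True
```

In the encoder-decoder methods (1 and 2), by contrast, the whole input sequence is read before any output is produced, so every output can depend on every input.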

Sayan Dey