Sorry, I am new to RNNs. I have read this post on the TimeDistributed layer.
I have reshaped my data into the [samples, time_steps, features] format Keras requires: (140, 50, 19), which means I have 140 data points, each with 50 time steps and 19 features. My output is shaped (140, 50, 1). I care most about the accuracy of the last time step's prediction. This is a regression problem.
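To make the shapes concrete, here is a small sketch with random data standing in for my real features and targets:

```python
import numpy as np

# Dummy data with the same shapes as my real problem:
# 140 samples, 50 time steps, 19 features per step.
X = np.random.rand(140, 50, 19)
Y = np.random.rand(140, 50, 1)  # one regression target per time step

# The prediction I care most about is the one at the last time step:
y_last = Y[:, -1, :]
print(y_last.shape)  # (140, 1)
```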
My current code is:

```python
# Keras 1.x API (dropout_W, consume_less, Model(input=..., output=...))
from keras.layers import Input, Dense, LSTM
from keras.models import Model
from keras.optimizers import SGD

x = Input((None, X_train.shape[-1]), name='input')
lstm_kwargs = {'dropout_W': 0.25, 'return_sequences': True, 'consume_less': 'gpu'}
lstm1 = LSTM(64, name='lstm1', **lstm_kwargs)(x)
output = Dense(1, activation='relu', name='output')(lstm1)
model = Model(input=x, output=output)

sgd = SGD(lr=0.00006, momentum=0.8, decay=0, nesterov=False)
model.compile(optimizer=sgd, loss='mean_squared_error')
```
My questions are:

- My case is many-to-many, so do I need to use `return_sequences=True`? And if I only needed the last time step's prediction, it would be many-to-one, so I would need my output shaped (140, 1) and `return_sequences=False`?
- Is there any way to improve the last time step's accuracy if I use many-to-many? I care more about it than about the accuracy at the other points.
  I have tried using a TimeDistributed layer, as in `output = TimeDistributed(Dense(1, activation='relu'), name='output')(lstm1)`, but the performance seems to be worse than without the TimeDistributed layer. Why is this so?
- I tried to use `optimizer=RMSprop(lr=0.001)`. I thought `RMSprop` was supposed to stabilize the network, but I was never able to get good results with it.
- How do I choose a good `lr` and momentum for `SGD`? I have been testing different combinations manually. Is there a cross-validation method in Keras?
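What I have been doing so far is essentially a manual grid search, along these lines (`train_and_score` here is a placeholder for compiling the model above with the given `lr` and momentum, fitting it, and returning the validation loss; the dummy score below just makes the sketch runnable):

```python
from itertools import product

def train_and_score(lr, momentum):
    # Placeholder: my real code builds the model, compiles it with
    # SGD(lr=lr, momentum=momentum), fits it, and returns the
    # validation mean squared error.
    return (lr - 0.0001) ** 2 + (momentum - 0.9) ** 2  # dummy score

lrs = [0.00006, 0.0001, 0.0005]
momenta = [0.5, 0.8, 0.9]

# Pick the (lr, momentum) pair with the lowest score.
best = min(product(lrs, momenta), key=lambda p: train_and_score(*p))
print(best)  # (0.0001, 0.9) for this dummy score
```

Is there a built-in way to do this kind of search with cross-validation, or do people just loop like this?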