
I have trouble understanding tensor behaviour in LSTM layers in keras.

I have preprocessed numeric data that looks like [samples, time steps, features]: 10,000 samples, 24 time steps, and 10 predictors.
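
For concreteness, a dummy array with this layout (an illustration, not my real data):

x_train <- array(rnorm(10000 * 24 * 10), dim = c(10000, 24, 10))
dim(x_train)  # 10000 24 10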

I want to stack LSTM layers with residual connections, but I am not sure I am doing it right:

x <- layer_input(shape = c(24,10))

x <- layer_lstm(x,units=32,activation="tanh",return_sequences=T)

Now the shape of x, which is a tensor, is [?, ?, 32]. I was expecting [?, 32, 10]. Should I reshape the data to [samples, features, time steps]? Then I form res:

y <- layer_lstm(x,units=32,activation="tanh",return_sequences=T)

res <- layer_add(c(x, y))

Now I am not sure if this is correct, or whether I should go with this instead:

x <- layer_input(shape = c(24,10))

y <- layer_lstm(x,units=24,activation="tanh",return_sequences=T) # same as time_steps

res <- layer_add(c(x,y)) ## perhaps here data reshaping is necessary?

Any insight is much appreciated.

JJ

JacobJacox

1 Answer


An LSTM layer with return_sequences = TRUE returns dims (?, seq_length, out_dims), where out_dims is units in your case. So overall the dims will be:

x <- layer_input(shape = c(24, 10))
# dims of x: (?, 24, 10)
x <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)
# dims of x after the LSTM layer: (?, 24, 32)

y <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)
# dims of y: (?, 24, 32)
res <- layer_add(c(x, y))
# dims of res: (?, 24, 32); it is the addition of the outputs of both LSTM layers.
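
If you want to verify this end to end, here is a minimal runnable sketch, assuming the keras R package; the model summary prints the same shapes as the comments above:

library(keras)

inputs <- layer_input(shape = c(24, 10))  # (?, 24, 10)
x <- layer_lstm(inputs, units = 32, activation = "tanh", return_sequences = TRUE)  # (?, 24, 32)
y <- layer_lstm(x, units = 32, activation = "tanh", return_sequences = TRUE)       # (?, 24, 32)
res <- layer_add(c(x, y))                                                          # (?, 24, 32)

model <- keras_model(inputs = inputs, outputs = res)
summary(model)  # output shapes match the comments above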

For more info, you can check this.

Ankish Bansal
  • So the LSTM will take all 10 predictors and combine them into a sequence of 24 outputs, each of which carries information from all 10 predictors? Did I understand you correctly? @Ankish Bansal – JacobJacox Jan 16 '19 at 14:32
  • Yeah, the sequence here can be seen as time steps: it takes the 10 predictors at each time step, computes a 32-dim output, and repeats this for 24 time steps – Ankish Bansal Jan 16 '19 at 14:50
  • I will ask just one more sub-question. If I am concerned about losing information in the deeper layers, which stack "y" with another "res", should I save the first "x" in our case and add it back later on? @Ankish Bansal – JacobJacox Jan 16 '19 at 16:02
  • I didn't get it; are you saying, what if you add another res block? – Ankish Bansal Jan 16 '19 at 16:06
  • Also, "residual layer" is loose wording here; this is more a stacking of two layers. "Residual" is used when we actually use the residual error as the objective for the layer's output. – Ankish Bansal Jan 16 '19 at 16:10
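
On the second variant from the question: layer_add(c(x, y)) fails there, because the input has dims (?, 24, 10) while the LSTM output has dims (?, 24, 24), and layer_add needs matching shapes. A minimal sketch of one common workaround, assuming the keras R package (the projection is an assumption on my part, not something proposed in the answer): project the input with a dense layer, which applies to the last axis, so the feature dims match before adding.

library(keras)

inputs <- layer_input(shape = c(24, 10))  # (?, 24, 10)
y <- layer_lstm(inputs, units = 24, activation = "tanh", return_sequences = TRUE)  # (?, 24, 24)

# Hypothetical fix: project the input to 24 dims per time step,
# so both tensors are (?, 24, 24) before the add.
x_proj <- layer_dense(inputs, units = 24)  # (?, 24, 24)
res <- layer_add(c(x_proj, y))             # (?, 24, 24)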