
Following a similar question, I have a problem where I need to predict many steps ahead for 3 different time series. I managed to build a network that, given the past 7 values of the 3 time series as input, predicts 5 future values for one of them. The input x has these dimensions:

(500, 7, 3): 500 samples, 7 past time steps, 3 variables/time series

The target y has these dimensions:

(500, 5): 500 samples, 5 future time steps
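(For illustration only, arrays with these shapes can be produced with a simple sliding window over the raw series; the names in this sketch are made up for the example and are not my actual preprocessing code.)

import numpy as np

def make_windows(series, n_past=7, n_future=5, target_col=0):
    # series: array of shape (n_timesteps, 3), the 3 raw time series as columns
    x, y = [], []
    for i in range(len(series) - n_past - n_future + 1):
        x.append(series[i:i + n_past, :])                               # 7 past steps, all 3 variables
        y.append(series[i + n_past:i + n_past + n_future, target_col])  # 5 future steps of one series
    return np.array(x), np.array(y)

# e.g. a series with 511 timesteps yields x of shape (500, 7, 3) and y of shape (500, 5)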

The LSTM network is defined as:

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation

model = Sequential()
model.add(LSTM(10, input_shape=(7, 3), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(50))
model.add(Dropout(0.2))
model.add(Dense(5))
model.add(Activation('linear'))
model.compile(loss='mae', optimizer='adam')

What if now I want to predict the values of 2 time series?

I tried the following code:

from keras.models import Model
from keras.layers import Input, Dense

inputs = Input(shape=(7, 3)) # 7 past steps and variables
m = Dense(64, activation='linear')(inputs)
m = Dense(64, activation='linear')(m)
outputA = Dense(1, activation='linear')(m)
outputB = Dense(1, activation='linear')(m)

m = Model(inputs=[inputs], outputs=[outputA, outputB])
m.compile(optimizer='adam', loss='mae')
m.fit(x, [y1, y2])

Here both y1 and y2 have the same dimensions as y, i.e. (500, 5), but I obtain the following error:

"Error when checking target: expected dense_4 to have 3 dimensions, but got array with shape (500, 5)".

How should I reshape y1 and y2? Or should I have a different structure for the network?

Titus Pullo
  • You are using `(7,3)` as input shape, but, unlike the first example, you are not using an LSTM, which, as stated in the [documentation](https://keras.io/layers/recurrent/), reduces the input tensor from 3 dimensions to 2 when `return_sequences=False`. To make this model work you have to add an LSTM (with `return_sequences=False`) or a Flatten layer before the output layers – gionni Dec 01 '17 at 17:17
  • I added a Flatten layer as: `flat = Flatten()(m) ; outputA = Dense(ahead,activation='linear')(flat) ; outputB = Dense(ahead,activation='linear')(flat)` (a fuller sketch of this version follows these comments). It does train now, but how come the training of the network is so much faster? – Titus Pullo Dec 01 '17 at 17:48
  • @gionni Would this network: `inputs = Input(shape=(7,6)); d1 = Dropout(0.2)(inputs); m = Dense(50,activation='linear')(d1); d2 = Dropout(0.2)(m); flat = Flatten()(d2); outputA = Dense(ahead,activation='linear')(flat); outputB = Dense(ahead,activation='linear')(flat); m = Model(inputs=[inputs], outputs=[outputA, outputB]); m.compile(optimizer='adam', loss='mae')` be equivalent to the one in my first example? – Titus Pullo Dec 01 '17 at 17:50
  • on the first comment: it's faster because you don't have the LSTM layer, which is slow to train, while the Flatten layer is just doing a reshaping of the input tensor. Similarly, for the second comment, it would not be the same since you have no LSTM layer. – gionni Dec 01 '17 at 17:56
  • Thanks. Can you have multiple outputs with the LSTM layer? I mean, could I re-use my first network? (Sorry, but totally new to LSTM). – Titus Pullo Dec 01 '17 at 18:04
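Putting the suggestion from these comments together, a minimal sketch of the Flatten-based version looks like this (Keras 2 functional API assumed; `ahead = 5` to match the target shape, and `x`, `y1`, `y2` are the arrays described in the question):

from keras.models import Model
from keras.layers import Input, Dense, Flatten

ahead = 5  # number of future steps per output

inputs = Input(shape=(7, 3))              # 7 past steps, 3 variables
m = Dense(64, activation='linear')(inputs)
m = Dense(64, activation='linear')(m)     # still 3D: (None, 7, 64)
flat = Flatten()(m)                       # 2D: (None, 7 * 64)
outputA = Dense(ahead, activation='linear')(flat)
outputB = Dense(ahead, activation='linear')(flat)

model = Model(inputs=[inputs], outputs=[outputA, outputB])
model.compile(optimizer='adam', loss='mae')
model.fit(x, [y1, y2])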

1 Answer


Following up on the comments, where I couldn't post readable code:

If you want to train your net on 2 outputs, keeping an architecture close to that of the second net you posted but using an LSTM, this should work:

from keras.models import Model
from keras.layers import Input, Dense, Dropout, LSTM

inputs = Input(shape=(7, 3)) # 7 past steps and variables
m = LSTM(10, return_sequences=True)(inputs)
m = Dropout(0.2)(m)
m = LSTM(50)(m)
m = Dropout(0.2)(m)
outputA = Dense(5, activation='linear')(m)
outputB = Dense(5, activation='linear')(m)

m = Model(inputs=[inputs], outputs=[outputA, outputB])
m.compile(optimizer='adam', loss='mae')
m.fit(x, [y1, y2])

Note that this architecture will give good results if the time dependencies of the 2 time series you are predicting are similar, since you will be using the same LSTM layers to process both and only split at the last layer, which does a sort of fine-tuning of the results for each time series. Another choice would be to use 2 nets like the first one you proposed, but that would double the computational effort.
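If you do go the two-separate-nets route instead, a minimal sketch would be to build your first model twice, once per target series (the helper name here is mine):

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

def build_single_output_net():
    # same architecture as the first model in the question, one 5-step output
    model = Sequential()
    model.add(LSTM(10, input_shape=(7, 3), return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(50))
    model.add(Dropout(0.2))
    model.add(Dense(5, activation='linear'))
    model.compile(loss='mae', optimizer='adam')
    return model

model_a = build_single_output_net()
model_b = build_single_output_net()
model_a.fit(x, y1)  # twice the training work compared to the shared net
model_b.fit(x, y2)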

Yet another option is to have the LSTM output multiple values directly. The basic idea is to keep your first model, but with `return_sequences=True` in the second LSTM layer as well. The problem is that if you want to keep 7 time steps as input and get only 5 as output, you need to slice your tensor somewhere between the first LSTM layer and the output layer, so that the output timesteps are reduced to 5, and there is no built-in slice layer in Keras. This is a custom layer that could work to slice. Also, I'm not sure this architecture is valid, theoretically speaking.
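As an illustration only (this is not the custom layer mentioned above, and the caveat about the architecture's validity still applies), a Lambda layer can do the slicing by keeping the last 5 timesteps:

from keras.models import Model
from keras.layers import Input, Dense, Dropout, LSTM, Lambda, TimeDistributed, Flatten

inputs = Input(shape=(7, 3))
m = LSTM(10, return_sequences=True)(inputs)
m = Dropout(0.2)(m)
m = LSTM(50, return_sequences=True)(m)                    # (None, 7, 50)
m = Dropout(0.2)(m)
m = Lambda(lambda t: t[:, -5:, :], output_shape=(5, 50))(m)  # keep the last 5 timesteps -> (None, 5, 50)
outputA = Flatten()(TimeDistributed(Dense(1))(m))         # (None, 5)
outputB = Flatten()(TimeDistributed(Dense(1))(m))         # (None, 5)

model = Model(inputs=[inputs], outputs=[outputA, outputB])
model.compile(optimizer='adam', loss='mae')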

One final note: instead of slicing, you could transpose the layer output, use a Dense layer to reduce the time dimension to the desired size, and transpose back to the original layout; or, similarly, use Flatten -> Dense and then reshape. Both of these options will give you a valid architecture (meaning that Keras will compile and fit it), but in both cases you would be messing with the time dimension, which is not advisable.
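A sketch of the transpose variant, just to show the mechanics (using Permute; as said above, mixing the time dimension this way is not advisable):

from keras.models import Model
from keras.layers import Input, Dense, LSTM, Permute, TimeDistributed, Flatten

inputs = Input(shape=(7, 3))
m = LSTM(50, return_sequences=True)(inputs)  # (None, 7, 50)
m = Permute((2, 1))(m)                       # (None, 50, 7) -- time axis last
m = Dense(5)(m)                              # (None, 50, 5) -- Dense acts on the last axis
m = Permute((2, 1))(m)                       # (None, 5, 50) -- back to (timesteps, features)
outputA = Flatten()(TimeDistributed(Dense(1))(m))  # (None, 5)
outputB = Flatten()(TimeDistributed(Dense(1))(m))  # (None, 5)

model = Model(inputs=[inputs], outputs=[outputA, outputB])
model.compile(optimizer='adam', loss='mae')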

Hope this helps.

gionni
  • Thanks for the exhaustive answer @gionni. I think your first solution would fit my needs best (I'm not really familiar with Keras, I'm a newbie). I'll try to play with it and see if it gives me good predictions. – Titus Pullo Dec 02 '17 at 10:55