
I want to use an LSTM neural network with Keras to forecast groups of time series, and I am having trouble making the model match what I want. The dimensions of my data are:

input tensor: (data length, number of series to train, time steps to look back)

output tensor: (data length, number of series to forecast, time steps to look ahead)

Note: I want to keep the dimensions exactly like that, no transposition.

Dummy code with random data that reproduces the problem is:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, TimeDistributed, LSTM

epoch_number = 100
batch_size = 20
input_dim = 4
output_dim = 3
look_back = 24
look_ahead = 24
n = 100

trainX = np.random.rand(n, input_dim, look_back)
trainY = np.random.rand(n, output_dim, look_ahead)
print('train X:', trainX.shape)
print('train Y:', trainY.shape)

model = Sequential()

# Add the first LSTM layer (The intermediate layers need to pass the sequences to the next layer)
model.add(LSTM(10, batch_input_shape=(None, input_dim, look_back), return_sequences=True))

# Add the second LSTM layer (the input dimensions are only needed in the first layer)
model.add(LSTM(10, return_sequences=True))

# the TimeDistributed object allows a 3D output
model.add(TimeDistributed(Dense(look_ahead)))

model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
model.fit(trainX, trainY, nb_epoch=epoch_number, batch_size=batch_size, verbose=1)

This throws:

Exception: Error when checking model target: expected timedistributed_1 to have shape (None, 4, 24) but got array with shape (100, 3, 24)

The problem seems to arise when defining the TimeDistributed layer.

How do I define the TimeDistributed layer so that it compiles and trains?

Santi Peñate-Vera

2 Answers


The error message is a bit misleading in your case. The output node of your network is called timedistributed_1 because that's the last node in your sequential model. What the error message is trying to tell you is that the output of this node does not match the target your model is fitting to, i.e. your labels trainY.

Your trainY has a shape of (n, output_dim, look_ahead), i.e. (100, 3, 24), but the network produces an output of shape (batch_size, input_dim, look_ahead). The problem in this case is that output_dim != input_dim. If your time dimension changes you may need padding, or a layer that removes the extra time step.
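As a rough sketch of that last option, assuming a Keras version that provides the Cropping1D layer (the layer sizes are the ones from the question; cropping simply discards steps along the axis Keras treats as time, so this only shows how to make the shapes line up):

from keras.models import Sequential
from keras.layers import Dense, TimeDistributed, LSTM, Cropping1D

input_dim, output_dim = 4, 3
look_back, look_ahead = 24, 24

model = Sequential()
model.add(LSTM(10, batch_input_shape=(None, input_dim, look_back), return_sequences=True))
model.add(LSTM(10, return_sequences=True))
model.add(TimeDistributed(Dense(look_ahead)))                # -> (None, 4, 24)
# drop (input_dim - output_dim) steps from the end of the "time" axis
model.add(Cropping1D(cropping=(0, input_dim - output_dim)))  # -> (None, 3, 24)
model.compile(loss='mean_squared_error', optimizer='adam')

The output now has the same shape as trainY, so fitting should no longer raise the exception; whether throwing away one series' output makes sense for your data is a separate question.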

nemo

I think the problem is that you expect output_dim (!= input_dim) at the output of TimeDistributed, which is not possible. That dimension is what Keras considers the time dimension, and it is preserved.

The input should be at least 3D, and the dimension of index one will be considered to be the temporal dimension.

The purpose of TimeDistributed is to apply the same layer to each time step. You can only end up with the same number of time steps as you started with.

If you really need to bring down this dimension from 4 to 3, I think you will need to either add another layer at the end, or use something different from TimeDistributed.
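For instance, a rough sketch of the extra-layers route, assuming a Keras version that provides the Permute layer (the layer sizes are the ones from the question): move the series axis into the feature position, map it from input_dim to output_dim with a Dense layer, and move it back:

from keras.models import Sequential
from keras.layers import Dense, TimeDistributed, LSTM, Permute

input_dim, output_dim = 4, 3
look_back, look_ahead = 24, 24

model = Sequential()
model.add(LSTM(10, batch_input_shape=(None, input_dim, look_back), return_sequences=True))
model.add(LSTM(10, return_sequences=True))
model.add(TimeDistributed(Dense(look_ahead)))   # (None, 4, 24)
model.add(Permute((2, 1)))                      # (None, 24, 4)
model.add(TimeDistributed(Dense(output_dim)))   # (None, 24, 3), mixes the 4 series into 3
model.add(Permute((2, 1)))                      # (None, 3, 24)
model.compile(loss='mean_squared_error', optimizer='adam')

The final output then matches the (n, output_dim, look_ahead) target without transposing the data itself.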

PS: one hint towards finding this issue was that output_dim is never used when creating the model; it only appears in the target data. While that's only a code smell (there may be nothing wrong with it), it's something worth checking.

Arnaud P
  • I am doing the transposition because for a single time series, this transposition makes the forecast much more accurate. I got the idea following this tutorial: http://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/ But I guess there's some more work to do for many-to-many relations – Santi Peñate-Vera Oct 18 '16 at 09:59