TL;DR - I have a couple of thousand speed profiles (time series in which the speed of a car has been sampled), and I am unsure how to configure my models so that I can perform arbitrary forecasting (i.e. predict samples t+1, ..., t+n given a sample t).
I have read numerous explanations (1, 2, 3, 4, 5) of how Keras implements statefulness in its recurrent layers, and of when one should or should not reset state between iterations.
However, I am still unable to get the model into the shape I want (I think).
As for now, I am only working with a subset of my profiles (denoted as routes in the code below).
Number of training routes: 90
Number of testing routes: 10
The routes vary in length, so the first thing I do is iterate through all routes and pad them with 0 so they are all the same length. (I have assumed this is required; if I am wrong, please let me know.) After the padding I convert the routes into a format better suited for the supervised learning task, as described HERE. In this case I have opted to forecast the 5 steps succeeding the current sample.
The result is a tensor, as:
Shape of training_data: (90, 3186, 6) == (nb_routes, nb_samples/route, nb_timesteps)
which is split into X and y for training as:
Shape of X: (90, 3186, 1)
Shape of y: (90, 3186, 5)
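For reference, this preparation step can be sketched as follows (the helper name `routes_to_supervised` and the toy routes are my own placeholders; the real data would have the shapes above):

```python
import numpy as np

def routes_to_supervised(routes, nb_future=5, pad_value=0.0):
    """Pad variable-length routes to a common length and build
    (current sample, next nb_future samples) pairs per timestep."""
    max_len = max(len(r) for r in routes)
    padded = [np.concatenate([r, np.full(max_len - len(r), pad_value)])
              for r in routes]
    data = []
    for r in padded:
        # Each row: the current sample followed by the next nb_future samples
        rows = [r[i:i + 1 + nb_future] for i in range(max_len - nb_future)]
        data.append(rows)
    data = np.asarray(data)          # (nb_routes, nb_samples/route, 1 + nb_future)
    X, y = data[..., :1], data[..., 1:]
    return X, y

routes = [np.arange(10, dtype=float), np.arange(7, dtype=float)]
X, y = routes_to_supervised(routes, nb_future=5)
print(X.shape, y.shape)  # (2, 5, 1) (2, 5, 5)
```

With 90 routes padded to 3186 usable timesteps, the same split yields (90, 3186, 1) and (90, 3186, 5).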
My goal is to have the model take one route at a time and train on it. I have created a model like this:
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

# Create model
model = Sequential()
# Add recurrent layer
model.add(SimpleRNN(nb_cells, batch_input_shape=(1, X.shape[1], X.shape[2]), stateful=True))
# Add dense layer at the end to acquire correct kind of forecast
model.add(Dense(y.shape[2]))
# Compile model
model.compile(loss="mean_squared_error", optimizer="adam", metrics=["accuracy"])
# Fit model (manual epoch loop so the state can be reset between epochs)
for _ in range(nb_epochs):
    model.fit(X, y,
              validation_split=0.1,
              epochs=1,
              batch_size=1,
              verbose=1,
              shuffle=False)
    model.reset_states()
This would imply that I have a model whose recurrent layer has nb_cells units, whose input is (number_of_samples, number_of_timesteps), i.e. (3186, 1), and whose output is (number_of_timesteps_lagged), i.e. (5).
However, when running the above I get the following error:
ValueError: Error when checking target: expected dense_1 to have 2 dimensions, but got array with shape (90, 3186, 5)
I have tried different ways to solve the above, but I have been unsuccessful.
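If I trace the shapes by hand, the mismatch seems to come from return_sequences defaulting to False, which collapses the time axis before the Dense layer (a sketch, shape bookkeeping only; nb_cells is an arbitrary placeholder here):

```python
# Shape bookkeeping only -- no Keras needed to see the mismatch.
batch, timesteps, features = 1, 3186, 1
nb_cells, nb_future_obs = 32, 5            # nb_cells is arbitrary here

# return_sequences=False (the default) emits one vector per sequence:
rnn_out = (batch, nb_cells)                # 2D
dense_out = (batch, nb_future_obs)         # 2D, but the target y is 3D

# return_sequences=True keeps the time axis:
rnn_out_seq = (batch, timesteps, nb_cells)
dense_out_seq = (batch, timesteps, nb_future_obs)  # 3D, matching y per route
print(dense_out, dense_out_seq)  # (1, 5) (1, 3186, 5)
```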
I have also tried other ways of structuring my data and my model. For instance, merging my routes so that instead of (90, 3186, 6) I had (286740, 6): I simply concatenated the routes one after the other. After fiddling with my model I got this to run, and the results are quite good, but I really want to understand how this works - and I think the approach I am attempting above is better (if I can get it to work).
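That merged layout can be produced with a single reshape, since the 90 routes of 3186 rows are simply stacked along the time axis (a sketch with a zero placeholder array of the shapes above):

```python
import numpy as np

# Placeholder with the shapes from the question; real data would go here
training_data = np.zeros((90, 3186, 6))

# Stack all routes along the first axis: (90, 3186, 6) -> (286740, 6)
merged = training_data.reshape(-1, training_data.shape[-1])
print(merged.shape)  # (286740, 6)
```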
Update
Note: I am still looking for feedback.
I have reached a "solution" which I think does the trick.
I have abandoned the padding and instead opted for a one-sample-at-a-time approach. The reason is that I am trying to build a network that lets me predict by providing it with one sample at a time: I want to give the network sample t and have it predict t+1, t+2, ..., t+n, so it is my understanding that I must train the network on one sample at a time. I also assume that using:
- stateful will allow me to keep the hidden state of the cells unspoiled between batches (meaning that I can set the batch size to len(route))
- return_sequences will allow me to get the output vector that I desire
The changed code is given below. Unlike in the original question, the shape of the input data is now (90,) (i.e. 90 routes of various lengths), but each training route still has only one feature per sample, and each label route has five samples per feature (the lagged times).
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

def fit_model(training_data, nb_cells, nb_past_obs, nb_future_obs, nb_epochs):
    # Create model
    model = Sequential()
    # Add nn_type cells
    model.add(SimpleRNN(nb_cells, return_sequences=True, stateful=True,
                        batch_input_shape=(1, 1, nb_past_obs)))
    # Add dense layer at the end to acquire correct kind of forecast
    model.add(Dense(nb_future_obs))
    # Compile model
    model.compile(loss="mean_squared_error", optimizer="adam", metrics=["accuracy"])
    # Fit model: one sample at a time, resetting the state after each route
    for e in range(nb_epochs):
        for r in range(len(training_data)):
            route = training_data[r]
            for s in range(len(route)):
                X = route[s, :nb_past_obs].reshape(1, 1, nb_past_obs)
                y = route[s, nb_past_obs:].reshape(1, 1, nb_future_obs)
                model.fit(X, y,
                          epochs=1,
                          batch_size=1,
                          verbose=0,
                          shuffle=False)
            model.reset_states()
    return model
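If this works as intended, prediction should follow the same pattern: feed one sample at a time, let the state carry over within a route, and reset it between routes. A sketch (predict_route is a hypothetical helper of my own; model and nb_past_obs follow the names above):

```python
import numpy as np

def predict_route(model, route, nb_past_obs):
    """Feed one route sample-by-sample through a stateful model."""
    model.reset_states()                 # fresh state for a new route
    preds = []
    for s in range(len(route)):
        X = route[s, :nb_past_obs].reshape(1, 1, nb_past_obs)
        # predict returns (1, 1, nb_future_obs); keep the forecast vector
        preds.append(model.predict(X, batch_size=1)[0, 0])
    return np.asarray(preds)             # (len(route), nb_future_obs)
```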