
TL;DR - I have a couple of thousand speed profiles (time series in which the speed of a car has been sampled) and I am unsure how to configure my models so that I can perform arbitrary forecasting (i.e. predict samples t+1, ..., t+n given sample t).


I have read numerous explanations (1, 2, 3, 4, 5) of how Keras implements statefulness in its recurrent layers, when one should or should not reset states between iterations, etc.

However, I am unable to get the model architecture that I want (I think).

For now, I am only working with a subset of my profiles (denoted routes in the code below).

Number of training routes: 90
Number of testing routes: 10

The routes vary in length, so the first thing I do is iterate through all routes and pad them with 0 so they are all the same length. (I have assumed this is required; if I am wrong, please let me know.) After the padding I convert the routes into a format better suited for the supervised learning task, as described HERE. In this case I have opted to forecast the 5 steps succeeding the current sample.
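Concretely, the padding and framing step looks roughly like this (a sketch; the helper names and train_routes are mine, not from the linked post):

import numpy as np

def pad_routes(routes):
    """Zero-pad a list of 1-D speed profiles to a common length."""
    max_len = max(len(r) for r in routes)
    return [np.pad(r, (0, max_len - len(r)), mode="constant") for r in routes]

def frame_route(route, nb_past_obs=1, nb_future_obs=5):
    """Turn a 1-D series into rows of (sample t, samples t+1 ... t+n)."""
    window = nb_past_obs + nb_future_obs
    return np.array([route[i:i + window] for i in range(len(route) - window + 1)])

training_data = np.stack([frame_route(r) for r in pad_routes(train_routes)])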

The result is a tensor, as:

Shape of training_data: (90, 3186, 6) == (nb_routes, nb_samples/route, nb_timesteps)

which is split into X and y for training as:

Shape of X: (90, 3186, 1)
Shape of y: (90, 3186, 5)
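The split itself is just slicing on the last axis (nb_past_obs = 1 and nb_future_obs = 5, the names I use further down):

# Input: the current sample; target: the 5 succeeding samples
X = training_data[:, :, :nb_past_obs]   # (90, 3186, 1)
y = training_data[:, :, nb_past_obs:]   # (90, 3186, 5)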

My goal is to have the model take one route at a time and train on it. I have created a model like this:

from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

# Create model
model = Sequential()

# Add recurrent layer
model.add(SimpleRNN(nb_cells, batch_input_shape=(1, X.shape[1], X.shape[2]), stateful=True))

# Add dense layer at the end to acquire correct kind of forecast
model.add(Dense(y.shape[2]))

# Compile model
model.compile(loss="mean_squared_error", optimizer="adam", metrics=["accuracy"])

# Fit model
for _ in range(nb_epochs):
    model.fit(X, y,
              validation_split=0.1,
              epochs=1,
              batch_size=1,
              verbose=1,
              shuffle=False)
    model.reset_states()

This would imply that I have a model with a single SimpleRNN layer of nb_cells units, where the input per batch is (number_of_timesteps, number_of_features), i.e. (3186, 1), and the output per sample is the lagged future timesteps, i.e. (5).

However, when running the above I get the following error:

ValueError: Error when checking target: expected dense_1 to have 2 dimensions, but got array with shape (90, 3186, 5)

I have tried different ways to solve the above, but I have been unsuccessful.
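In hindsight, I believe the mismatch is that without return_sequences the SimpleRNN emits a single vector per route, so the Dense layer produces a 2-D output while y is 3-D. A sketch of how the shapes could be made to line up (not necessarily the right modelling choice):

model = Sequential()
# return_sequences=True -> RNN output shape (1, 3186, nb_cells), so the
# Dense layer is applied per timestep and yields (1, 3186, 5), matching y
model.add(SimpleRNN(nb_cells, return_sequences=True, stateful=True,
                    batch_input_shape=(1, X.shape[1], X.shape[2])))
model.add(Dense(y.shape[2]))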

I have also tried other ways of structuring my data and my model. For instance, merging my routes so that instead of (90, 3186, 6) I had (286740, 6): I simply concatenated the routes one after the other. After fiddling with my model I got this to run, and the result is quite good, but I really want to understand how this works - and I think the solution I am attempting above is better (if I can get it to work).
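For reference, the merging was nothing more than a reshape (a sketch, using the names from above):

# Stack all routes end to end along the time axis:
# (90, 3186, 6) -> (90 * 3186, 6) == (286740, 6)
merged = training_data.reshape(-1, training_data.shape[-1])
X_merged = merged[:, :nb_past_obs]   # (286740, 1)
y_merged = merged[:, nb_past_obs:]   # (286740, 5)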


Update

Note: I am still looking for feedback.

I have reached a "solution" which I think does the trick.

I have abandoned the padding and instead opted for a one-sample-at-a-time approach. The reason is that I want a network that lets me predict by feeding it one sample at a time: I want to give the network sample t and have it predict t+1, t+2, ..., t+n, so it is my understanding that I must also train the network on one sample at a time. I also assume that using:

  • stateful will keep the hidden state of the cells intact between batches (so the state carries across all len(route) consecutive samples of a route, as if they were one long batch)
  • return_sequences will give me the output vector that I desire

The changed code is given below. Unlike in the original question, the input data now has shape (90,) (i.e. 90 routes of varying length); each training sample still has only one input feature, and each label has five values per sample (the lagged future steps).

# Create model
model = Sequential()

# Add recurrent layer
model.add(SimpleRNN(nb_cells, return_sequences=True, stateful=True,
                    batch_input_shape=(1, 1, nb_past_obs)))

# Add dense layer at the end to acquire correct kind of forecast
model.add(Dense(nb_future_obs))

# Compile model
model.compile(loss="mean_squared_error", optimizer="adam", metrics=["accuracy"])

# Fit model
for e in range(nb_epochs):
    for r in range(len(training_data)):
        route = training_data[r]
        for s in range(len(route)):
            X = route[s, :nb_past_obs].reshape(1, 1, nb_past_obs)
            y = route[s, nb_past_obs:].reshape(1, 1, nb_future_obs)
            model.fit(X, y,
                      epochs=1,
                      batch_size=1,
                      verbose=0,
                      shuffle=False)
        # Reset between routes so the state of one route does not leak into the next
        model.reset_states()

return model
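To actually forecast with the returned model, the idea is then to feed it one sample at a time in the same fashion (a sketch; test_route is assumed to be framed like the training routes):

# Predict a whole route step by step; the stateful RNN carries its
# hidden state from one sample to the next within the route
model.reset_states()
predictions = []
for s in range(len(test_route)):
    x = test_route[s, :nb_past_obs].reshape(1, 1, nb_past_obs)
    y_hat = model.predict(x, batch_size=1)   # shape (1, 1, nb_future_obs)
    predictions.append(y_hat.reshape(nb_future_obs))
model.reset_states()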
  • Sorry, did this new amendment work? It seemed like the issue was that you had to flatten before your dense layer, i.e. **model.Flatten()**. – Simbarashe Timothy Motsi Feb 15 '18 at 06:37
  • @SimbarasheTimothyMotsi: I am not sure that *flattening* really will do the trick here. Maybe I misunderstand what `model.Flatten()` does (or what I have done...) but I actually tried that approach first and thought it was wrong. It is my understanding that *flattening* would take my list of routes and convert it into a list where all samples are consecutive, so (90, 3876, 1) turns into (348840, 1). This is all right w.r.t. input shape, but would it yield the same result w.r.t. the `stateful` ability of the network? (Wouldn't a flattened model with batch size `b` reset the state more often?) – GLaDER Feb 15 '18 at 06:58
  • **model.Flatten()** flattens the input. It does not affect the batch size; it is similar to numpy.reshape. Now, the issue with RNNs is that they require different input dimensions compared to, say, CNNs, so you need to flatten so it fits into your RNN. – Simbarashe Timothy Motsi Feb 15 '18 at 07:33
  • Are you saying I am misunderstanding what `model.Flatten()` or `stateful=True` achieves in my example above? I do not see how your comment answers either of the two questions I posed in my previous comment. – GLaDER Feb 15 '18 at 08:00
