
I'm training an LSTM model whose input is a sequence of 50 time steps of 3 different features, laid out as below:

#x_train
[[[a0,b0,c0], ..., [a49,b49,c49]],
 [[a1,b1,c1], ..., [a50,b50,c50]],
 ...
 [[a49,b49,c49], ..., [a98,b98,c98]]]

Using the following dependent variable

#y_train
[a50, a51, a52, ... a99]
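For concreteness, here is a sketch of how such windows could be built (assuming the raw series is a NumPy array of shape (100, 3); the names are illustrative):

import numpy as np

# series: raw data of shape (100, 3), columns a, b, c (illustrative)
series = np.random.rand(100, 3)

window = 50
x_train = np.stack([series[i:i + window] for i in range(len(series) - window)])
y_train = series[window:, 0]  # targets: feature a, one step after each window

print(x_train.shape)  # (50, 50, 3)
print(y_train.shape)  # (50,)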

The code below works to predict just a. How do I get it to predict and return a vector [a, b, c] at a given time step?

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation

def build_model():
    model = Sequential()

    model.add(LSTM(
        input_shape=(50, 3),
        return_sequences=True, units=50))
    model.add(Dropout(0.2))

    model.add(LSTM(
        250,
        return_sequences=False))
    model.add(Dropout(0.2))

    model.add(Dense(1))
    model.add(Activation("linear"))

    model.compile(loss="mse", optimizer="rmsprop")
    return model
James Li
  • Have you tried altering the network to have the labels be [[a50, b50, c50], [a51, b51, c51], ... [a99, b99, c99]]? – txizzle Sep 07 '17 at 18:32
  • txizzle, I'm not sure what you mean; ax, bx, cx are just placeholders I used for discrete time-series data points. – James Li Sep 07 '17 at 19:10

1 Answer


The output of every layer is determined by how many cells/units/filters it has.

Your output has 1 feature because Dense(1) has only one unit.

Simply changing it to Dense(3) solves your problem.
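A minimal sketch of that fix (assuming the labels are reshaped so each one is an [a, b, c] vector, i.e. y_train has shape (samples, 3)):

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation

model = Sequential()
model.add(LSTM(50, input_shape=(50, 3), return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(250, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(3))           # one unit per target feature: a, b, c
model.add(Activation("linear"))
model.compile(loss="mse", optimizer="rmsprop")

print(model.output_shape)     # (None, 3)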


Now, if you want the output to have the same number of time steps as the input, you need to set return_sequences=True in all your LSTM layers.

The output shape of an LSTM depends on return_sequences (checked in the snippet below):

  • (batch size, units) - with return_sequences=False
  • (batch size, time steps, units) - with return_sequences=True
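A quick way to check these shapes (a sketch using standalone Keras, as in the question):

from keras.models import Sequential
from keras.layers import LSTM

# return_sequences=False: only the last time step is returned
m = Sequential([LSTM(50, input_shape=(50, 3), return_sequences=False)])
print(m.output_shape)  # (None, 50)

# return_sequences=True: one output per time step
m = Sequential([LSTM(50, input_shape=(50, 3), return_sequences=True)])
print(m.output_shape)  # (None, 50, 50)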

Then you use a TimeDistributed layer wrapper around the following layers so they also work with time steps (it basically preserves the time-step dimension in the middle).

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense, Activation, TimeDistributed

def build_model():
    model = Sequential()

    model.add(LSTM(
        input_shape=(50, 3),
        return_sequences=True, units=50))
    model.add(Dropout(0.2))

    model.add(LSTM(
        250,
        return_sequences=True))  # keep the time-step dimension
    model.add(Dropout(0.2))

    # apply the same Dense(3) to every time step -> output shape (batch, 50, 3)
    model.add(TimeDistributed(Dense(3)))
    model.add(Activation("linear"))

    model.compile(loss="mse", optimizer="rmsprop")
    return model
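
With this model the labels must be full sequences of shape (samples, 50, 3). A usage sketch with dummy data (the shapes are the point; the values are random):

import numpy as np

x_train = np.random.rand(50, 50, 3)   # (samples, time steps, features)
y_train = np.random.rand(50, 50, 3)   # one [a, b, c] vector per time step

model = build_model()
model.fit(x_train, y_train, epochs=2, batch_size=16)
print(model.predict(x_train).shape)   # (50, 50, 3)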
Daniel Möller
  • Awesome, that did it! Curious about the TimeDistributed wrapper: what would be a use case for returning the full set of time steps in the final output? Wouldn't that be the same as just running predict on the input data set? – James Li Sep 07 '17 at 19:39
  • It all depends on what you want to do. Imagine a case where you are trying to judge what a person would feel throughout a movie. Each movie frame would be a time step, and you want to classify all steps to get a time-distributed evolution of the person's feelings. But you could never guess a feeling from a single frame individually. That is the role of the LSTM layer: analyse each time step and keep track of what is going on. – Daniel Möller Sep 07 '17 at 19:44
  • Is there a way to produce outputs longer than the returned sequence? In other words, I would like to predict the features multiple time steps ahead. Is that possible with a Keras model written as in your answer, or can it return at most an output the size of the window of time steps used in the recurrent structure? – AleB May 13 '19 at 20:41
  • You can take a look at the examples here with `stateful = True` for predicting the future: https://stackoverflow.com/questions/38714959/understanding-keras-lstms/50235563#50235563 – Daniel Möller May 14 '19 at 12:52