
I've got a problem where I'd like to train on multiple time series. I have a multi-dimensional input and a univariate output. The feature I'm trying to predict is not included in the input, so I want features[:now] -> LSTM -> target[now]. I want this for all values of now in the series. My series are of varying lengths.

I gather from Jason Brownlee's posts that input is given as a tensor of shape (n_samples, n_timesteps, n_features), and I've remembered/found that I can deal with different lengths of series by passing None for n_timesteps in the model's input_shape. Great.
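
For concreteness, here's a minimal sketch of that shape convention (the sizes, array names, and layer width below are made up, not my real data):

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

T, n_features = 50, 5                       # illustrative sizes
one_series = np.random.rand(T, n_features)  # stand-in for a real series

# a "batch" of one variable-length series: (n_samples=1, n_timesteps=T, n_features)
x = one_series[np.newaxis, ...]

model = Sequential()
model.add(LSTM(10, input_shape=(None, n_features)))  # None = any sequence length
model.compile(loss='mean_squared_error', optimizer='adam')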

But I don't want to have to grab (features[:i], target[i]) for all i and call fit over and over and over again, when the network is perfectly primed to take (features[i+1], target[i+1]) and find gradients right after i.

How can I use all my targets efficiently, without having to reset state and refit to tons of different views of my data? Why is the Keras documentation so bad that it doesn't even specify the dimensionality x and y are supposed to take?

  • Aside: Obviously I want to reset the state between series while training. I've managed to figure out that if examples are passed as a batch, then [`n_samples` different states are held in the LSTM](https://stackoverflow.com/questions/43882796/when-does-keras-reset-an-lstm-state), so they don't interfere. But I don't think I can create a tensor with all my samples without padding, since that becomes one tensor, and my sequences are of variable lengths. But I'm okay with calling `fit()` on each individually, with a batch size of 1. – Pavel Komarov Jul 22 '21 at 20:00
  • Do I need to make my LSTM layer stateful and pass each sequence one point at a time? I can imagine setting up my network to take input of shape `(n_timesteps=1, n_features)` and calling `fit(features[i], target[i])` for each `i`, then resetting state manually at the end (a rough sketch of this appears below these comments). Just seems like there should be a way to do this inside the library call. – Pavel Komarov Jul 22 '21 at 20:24
  • I found what the output dimension is. It's `(n_samples, n_units)`, where the number of units is the number of LSTM nodes in the layer. https://www.youtube.com/watch?v=CcGf_Uo7NMw – Pavel Komarov Jul 22 '21 at 21:13
  • It looks like [return_sequences=True](https://www.tensorflow.org/api_docs/python/tf/compat/v1/keras/layers/LSTM) would cause the layer to return the thing I want to be comparing my targets against, but I'm not sure what to do from there. ["You may also need to access the sequence of hidden state outputs when predicting a sequence of outputs with a Dense output layer wrapped in a TimeDistributed layer."](https://machinelearningmastery.com/return-sequences-and-return-states-for-lstms-in-keras/) That's intriguing. – Pavel Komarov Jul 22 '21 at 21:22
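
A rough sketch of the stateful, one-point-at-a-time idea from the comments above (the `series` variable and the sizes are illustrative only; this is the manual alternative, not necessarily the best route):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

n_features = 5  # illustrative

model = Sequential()
model.add(LSTM(10, batch_input_shape=(1, 1, n_features), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')

for features, target in series:  # each series: (T_i, n_features) inputs, (T_i,) targets
    for i in range(len(target)):
        x = features[i].reshape(1, 1, n_features)  # one timestep per call
        y = target[i].reshape(1, 1)
        model.train_on_batch(x, y)
    model.reset_states()  # clear hidden state before the next series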

1 Answer


I think I may have figured it out. I essentially want to do many-to-many prediction.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor

# create LSTM
n_samples = 1 # batches of one sequence at a time
n_timesteps = None # sequences of variable length
n_features = X_train.shape[1] # the width of the input matrix
n_neurons = 10 # number of LSTM nodes

def make_lstm():
    model = Sequential()
    model.add(LSTM(n_neurons, input_shape=(n_timesteps, n_features), return_sequences=True))
    model.add(TimeDistributed(Dense(1))) # single output per timestep
    model.compile(loss='mean_squared_error', optimizer='adam')
    model.summary()
    return model

estimator = KerasRegressor(build_fn=make_lstm, epochs=100,
                           batch_size=1, verbose=True)

# reshape data to (1, trial_length, n_features) before passing to the network
# output comes back of shape (1, trial_length)
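
And a sketch of how I'd train it over several variable-length series, assuming (purely for illustration) the inputs live in a list `X_list` of `(trial_length, n_features)` arrays with matching targets in `y_list`. As far as I can tell the scikit-learn wrapper rebuilds the model each time `fit()` is called, so for looping over one series at a time it's easier to drive the Keras model directly:

import numpy as np

# illustrative stand-ins, not my real data: three series of different lengths
X_list = [np.random.rand(T, n_features) for T in (40, 55, 62)]
y_list = [np.random.rand(T) for T in (40, 55, 62)]

model = make_lstm()
for epoch in range(100):
    for X_i, y_i in zip(X_list, y_list):
        x = X_i[np.newaxis, ...]   # (1, trial_length, n_features)
        y = y_i.reshape(1, -1, 1)  # one target per timestep
        model.fit(x, y, epochs=1, batch_size=1, verbose=0)

# raw model output is (1, trial_length, 1); squeeze the last axis to get (1, trial_length)
preds = model.predict(X_list[0][np.newaxis, ...])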