
This question is a continuation of (LSTM - Making predictions on partial sequence). As described in the previous question, I've trained a stateful LSTM model for binary classification with batches of 100 samples/labels, like so:

[Feature 1,Feature 2, .... ,Feature 3][Label 1]
[Feature 1,Feature 2, .... ,Feature 3][Label 2]
...
[Feature 1,Feature 2, .... ,Feature 3][Label 100]

Model Code:

from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, LeakyReLU
from keras import optimizers

def build_model(num_samples, num_features, is_training):
    model = Sequential()
    opt = optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0001)

    # Training uses a variable batch size; prediction runs one sample at a
    # time with a stateful model, so state carries over between calls.
    batch_size = None if is_training else 1
    stateful = not is_training
    first_lstm = LSTM(32, batch_input_shape=(batch_size, num_samples, num_features),
                      return_sequences=True, activation='tanh', stateful=stateful)

    model.add(first_lstm)
    model.add(LeakyReLU())
    model.add(Dropout(0.2))
    model.add(LSTM(16, return_sequences=True, activation='tanh', stateful=stateful))
    model.add(Dropout(0.2))
    model.add(LeakyReLU())
    model.add(LSTM(8, return_sequences=True, activation='tanh', stateful=stateful))
    model.add(LeakyReLU())
    model.add(Dense(1, activation='sigmoid'))

    if is_training:
        model.compile(loss='binary_crossentropy', optimizer=opt,
                      metrics=['accuracy', f1])  # f1 is a custom metric
    return model

When predicting, the model is stateful, the batch size is 1, and the classification probability is retrieved after each sample, like so:

[Feature 1,Feature 2, .... ,Feature 10][Label 1] -> (model) -> probability

I call model.reset_states() after the model finishes processing a batch of 100 samples. The model works and the results are great.
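The prediction loop described above can be sketched as follows (`predict_stream` is an illustrative name, not part of the original code; `model` is assumed to expose Keras-style `predict()` and `reset_states()` methods):

```python
def predict_stream(model, samples, batch_size=100):
    """Return one probability per sample, resetting the LSTM state
    after every full batch of `batch_size` samples."""
    probs = []
    for i, sample in enumerate(samples):
        # The stateful model carries its hidden state from the
        # previous call into this one.
        probs.append(model.predict(sample))
        if (i + 1) % batch_size == 0:
            # End of a 100-sample sequence: clear the hidden state.
            model.reset_states()
    return probs
```

This works as long as all samples in a batch come from the same source, which is exactly the assumption that breaks in production.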

Note: My data are events coming from multiple sources.


My Problem:

When I test my model, I have control over the order of the samples and can make sure they arrive from the same source, i.e. the first 100 samples are all from source 1; then, after calling model.reset_states(), the next 100 samples are all from source 2, and so on.

In my production environment, however, the samples arrive asynchronously, for example: first 3 samples from source 1, then 2 samples from source 2, etc.

Illustration:

[image: samples from multiple sources arriving interleaved]


My Question:

How can I serialize the model state at a certain timestamp for each source, so that I can save it after each sample and load it back when a new sample arrives from the same source?

Shlomi Schwartz
  • What about saving the whole model for each source separately and loading back the corresponding model for source X when you want to continue processing the remaining data from source X? Especially if the model size is not large, this might work well in terms of performance. – today Feb 24 '19 at 13:31
  • Thanks for the reply, I'm not sure I get it. The model weights are always the same during prediction, furthermore, I use the same model for all the sources. What I need to save is the state of the model while in the middle of a sequence. – Shlomi Schwartz Feb 24 '19 at 13:53
  • 1
    When you save the model object, its state will be stored as well. However, it might make more sense to only get the state and only store that, as suggested in @RoniGadot 's answer. – today Feb 24 '19 at 19:39

1 Answer


You can get and set the internal states like so:

import keras.backend as K

def get_states(model):
    # Each entry of model.state_updates is a (state_variable, update) pair;
    # read the current value of every state variable.
    return [K.get_value(s) for s, _ in model.state_updates]

def set_states(model, states):
    # Write previously saved values back into the state variables,
    # in the same order they were read.
    for (d, _), s in zip(model.state_updates, states):
        K.set_value(d, s)
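Building on these helpers, one way to handle interleaved sources is a per-source state cache: before predicting on a sample from source X, restore that source's saved states with set_states (or call model.reset_states() if the source is new), and save them again with get_states afterwards. The StateCache class below is an illustrative sketch, not part of the original answer; plain numpy arrays stand in for whatever get_states returns:

```python
import numpy as np

class StateCache:
    """Per-source store for LSTM states (illustrative sketch).

    A "states" value is what get_states(model) returns: a list of
    numpy arrays, one per stateful variable in the model.
    """
    def __init__(self):
        self._store = {}

    def save(self, source_id, states):
        # Copy, so later in-place updates inside the model
        # don't alias the cached arrays.
        self._store[source_id] = [np.copy(s) for s in states]

    def load(self, source_id):
        # Return None for an unseen source; the caller should then
        # call model.reset_states() instead of set_states().
        states = self._store.get(source_id)
        return None if states is None else [np.copy(s) for s in states]
```

A typical call site would be: `states = cache.load(source_id)`, then `set_states(model, states)` if states is not None else `model.reset_states()`, predict, and finally `cache.save(source_id, get_states(model))`.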
Roni Gadot