I run a service that performs inference on a stateful LSTM using Keras, and I am wondering what the threading semantics are. I am not asking how to store models per Flask session; I am interested in what happens under the hood in Keras when I run stateful models. For example, is creating a new model per thread overkill? Does Keras handle state per thread automatically?
This question is different from the suggested duplicate because the duplicate deals explicitly with storing objects per Flask session, whereas this question is about how Keras handles stateful models across threads.
My code for inference is like so:
Loading:
from tensorflow.keras.models import model_from_json

with open(f"{ROOT_DIR}/../../bias-model/bias-model.json", "r") as f:
    MODEL = model_from_json(f.read())
MODEL.load_weights(f"{ROOT_DIR}/../../bias-model/bias-model.h5")
Inference:
for i in range(batch_input.shape[0]):
    # Take the last timestep's output for this sequence
    prediction = MODEL.predict_on_batch(batch_input[i])[-1][0]
# Clear the LSTM state before handling the next input
MODEL.reset_states()
Because I reset states on the model, does this mean I should create a new model per thread? Or should I take a lock around predictions on the globally created bias model? Or is there some other mechanism I am missing?
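To make the lock option concrete, here is a minimal sketch of what I mean by serializing access to the shared model. This is hypothetical illustration code: `fake_predict` stands in for the real `MODEL.predict_on_batch(...)` / `MODEL.reset_states()` cycle, and the point is only that one request's predict/reset cannot interleave with another's.

```python
import threading

# Guard for the single shared (stateful) model
MODEL_LOCK = threading.Lock()

call_log = []  # records which thread ran which step, to show interleaving

def fake_predict(thread_id, n_steps):
    # Placeholder for the real predict_on_batch loop + reset_states
    for step in range(n_steps):
        call_log.append((thread_id, step))

def predict_serialized(thread_id, n_steps):
    # Holding the lock for the whole predict/reset cycle keeps the LSTM
    # state from being touched by a concurrent request mid-cycle.
    with MODEL_LOCK:
        fake_predict(thread_id, n_steps)

threads = [threading.Thread(target=predict_serialized, args=(i, 3))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Because of the lock, each thread's 3 steps form an uninterrupted run
runs = [call_log[i:i + 3] for i in range(0, len(call_log), 3)]
```

The obvious cost is that all requests are serialized through one model, which may or may not be acceptable for my throughput.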
I should add that I am running TF 2.3.1 with Keras 2.4.3; solutions I find when researching are often incompatible with these versions.
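For completeness, the model-per-thread alternative I mentioned could be sketched with `threading.local`, so each worker thread lazily builds its own copy and no lock is needed. Again this is a hypothetical sketch: `build_model` is a placeholder for the `model_from_json` / `load_weights` code above.

```python
import threading

_local = threading.local()

def build_model():
    # Stand-in for the real loading code (model_from_json + load_weights);
    # here we just return a dict tagged with the building thread's name.
    return {"built_by": threading.current_thread().name}

def get_model():
    # Build the model at most once per thread, on first use
    if not hasattr(_local, "model"):
        _local.model = build_model()
    return _local.model

results = []

def worker():
    results.append(get_model()["built_by"])

threads = [threading.Thread(target=worker, name=f"w{i}") for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The trade-off here would be memory: each worker thread holds a full copy of the model's weights.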