CNN model for deployment: how to optimize

Question

It's my first time deploying a model. I've created a CNN model using TensorFlow, Keras, and Xception, and the saved model is about 80 MB. When I load it inside a function and run a prediction, it takes about 4-5 seconds. Is there a way to reduce this time? Does the model have to be loaded for every prediction?

[Image: screenshot of the asker's function]

What do you mean by *Does the model has to be loaded for every prediction?* Also, you can try calling `model(x)` instead of `model.predict(x)`. – Frightera, Mar 01 '21 at 08:14

score 0 · Answer 1 · answered Mar 01 '21 at 08:36

The model only needs to be loaded once in your program; for each prediction, you reuse the already-loaded model. The prediction itself may take some time, but TensorFlow doesn't reload the model on every prediction. A better approach is to save only the weights after training, and for inference to create the model architecture in code and then load the saved weights into it.
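The weights-only approach described above could look roughly like the sketch below. The architecture in `build_model` (an Xception base with one dense output layer), the input shape, and the file name `xception.weights.h5` are assumptions for illustration only; the rebuilt architecture has to match whatever was actually trained.

```python
def build_model(num_classes):
    """Rebuild the architecture in code (assumed layout, adjust to your model)."""
    # Imported inside the function so importing this module stays cheap.
    from tensorflow import keras

    base = keras.applications.Xception(
        weights=None,            # no pretrained download; weights come from file
        include_top=False,
        pooling="avg",
        input_shape=(299, 299, 3),
    )
    outputs = keras.layers.Dense(num_classes, activation="softmax")(base.output)
    return keras.Model(base.input, outputs)


def save_weights_after_training(model, path="xception.weights.h5"):
    """Save only the weights, not the full model."""
    model.save_weights(path)


def load_for_inference(num_classes, path="xception.weights.h5"):
    """Recreate the architecture, then load the saved weights into it."""
    model = build_model(num_classes)
    model.load_weights(path)
    return model
```

`load_for_inference` should still be called once at startup rather than inside the prediction function.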

Sadegh Ranjbar


I've added an image of my function. `model = load_model('model.h5')` is inside my function. I have to deploy multiple models on a website, so there is another .py file that imports this function. How do I go about it? – sonam agarwal Mar 01 '21 at 08:42
The best practice is to write a singleton class that loads the model in its constructor and exposes a predict method. A singleton is initialized only once, so your model is loaded just once, at application startup. This question is helpful: https://stackoverflow.com/questions/6760685/creating-a-singleton-in-python – Sadegh Ranjbar Mar 01 '21 at 09:18
Thanks, I'll try that. – sonam agarwal Mar 01 '21 at 10:06
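
As a minimal sketch of the load-once idea from the comments: instead of a full singleton class, this uses a module-level cache keyed by file path, which also covers the case of several models on one site. `load_from_disk` and `LOAD_CALLS` are hypothetical stand-ins so the example runs without TensorFlow; in a real app the loader body would call `load_model(path)`.

```python
from functools import lru_cache

LOAD_CALLS = []  # records each real load, only to make the caching visible


def load_from_disk(path):
    # Hypothetical stand-in for the expensive load. In a real app this
    # would be: return tensorflow.keras.models.load_model(path)
    LOAD_CALLS.append(path)
    return object()


@lru_cache(maxsize=None)
def get_model(path):
    """Load the model at `path` on first use; later calls reuse the same object."""
    return load_from_disk(path)
```

The prediction function would then call `get_model('model.h5')` where it currently calls `load_model('model.h5')`, so only the first request for each model pays the loading cost.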
