It's my first time deploying a model. I've created a CNN model using TensorFlow, Keras, and Xception, and the saved model is about 80 MB. When I load it inside a function and run a prediction, it takes about 4-5 seconds. Is there a way to reduce this time? Does the model have to be loaded for every prediction?
-
What do you mean by *Does the model have to be loaded for every prediction?* Also, you can try `model(x)` instead of `model.predict(x)`. – Frightera Mar 01 '21 at 08:14
-
I added my function as an image. I'll look into `model(x)`. – sonam agarwal Mar 01 '21 at 08:39
-
Thanks, `model(x)` reduced the prediction time. – sonam agarwal Mar 01 '21 at 10:05
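To check whether `model(x)` actually helps in a given setup, the two call styles can be timed side by side. The following is only a rough sketch: `mean_latency` is a hypothetical helper name, and `model` and `x` in the usage comments stand for an already loaded Keras model and an input batch.

```python
import time


def mean_latency(fn, batch, repeats=20, warmup=2):
    """Return the average wall-clock seconds per call of fn(batch)."""
    # Warm-up calls are excluded from the timing, since the first calls
    # often include one-off setup cost.
    for _ in range(warmup):
        fn(batch)
    start = time.perf_counter()
    for _ in range(repeats):
        fn(batch)
    return (time.perf_counter() - start) / repeats


# Usage, assuming `model` is loaded once and `x` is a single input batch:
#   t_predict = mean_latency(model.predict, x)
#   t_call = mean_latency(lambda b: model(b, training=False), x)
```

Note that `model(x)` returns a tensor rather than a NumPy array, so in eager mode you may need `.numpy()` on the result.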
1 Answer
The model needs to be loaded only once in your program; each prediction then reuses the loaded model. The prediction itself may still take some time, but TensorFlow does not reload the model on every prediction call. A better approach is to save only the weights after training and, at inference time, build the model architecture and then load the saved weights.
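The weights-only approach could look roughly like the sketch below. The helper names and the architecture in `build_xception_classifier` (class count, input shape, pooling, a single dense head) are assumptions for illustration; the builder has to reproduce the architecture you actually trained, otherwise `load_weights` will fail. Whether this is noticeably faster than `load_model` is something to measure in your own setup.

```python
def build_xception_classifier(num_classes=2, input_shape=(299, 299, 3)):
    """Hypothetical architecture; replace with the one used in training."""
    import tensorflow as tf  # imported lazily so the helpers below stay generic

    base = tf.keras.applications.Xception(
        weights=None, include_top=False, input_shape=input_shape, pooling="avg"
    )
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(base.output)
    return tf.keras.Model(base.input, outputs)


def save_weights_only(model, path):
    """After training: store only the weights, not the full model."""
    # Newer Keras versions expect the file name to end in ".weights.h5".
    model.save_weights(path)


def load_for_inference(build_fn, path):
    """At startup: rebuild the architecture, then load the saved weights."""
    model = build_fn()
    model.load_weights(path)
    return model


# Usage (run once at application startup, not per prediction):
#   model = load_for_inference(build_xception_classifier, "model.weights.h5")
```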

Sadegh Ranjbar
-
I've added an image of my function. `model = load_model('model.h5')` is inside the function. I have to deploy multiple models on a website, so there is another .py file that imports this function. How do I go about it? – sonam agarwal Mar 01 '21 at 08:42
-
The best practice is to write a singleton class that loads the model and runs predictions, and to load the model in the constructor of that class. A singleton class is initialized only once, so your model is loaded just once, at application startup. This question is helpful: https://stackoverflow.com/questions/6760685/creating-a-singleton-in-python – Sadegh Ranjbar Mar 01 '21 at 09:18
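One possible shape for such a singleton, extended to hold several models keyed by file path since multiple models are being deployed, is sketched below. `ModelRegistry`, `_default_loader`, and the `loader` parameter are hypothetical names; this variant loads each model on first use rather than strictly in the constructor.

```python
import threading


def _default_loader(path):
    # Imported lazily so that importing this module stays cheap.
    from tensorflow.keras.models import load_model
    return load_model(path)


class ModelRegistry:
    """Process-wide singleton that loads each model file at most once."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls, loader=_default_loader):
        with cls._lock:
            if cls._instance is None:
                inst = super().__new__(cls)
                inst._loader = loader  # only the first caller's loader is kept
                inst._models = {}      # path -> loaded model
                cls._instance = inst
        return cls._instance

    def get(self, path):
        """Return the model for path, loading it on first request."""
        with self._lock:
            if path not in self._models:
                self._models[path] = self._loader(path)
            return self._models[path]

    def predict(self, path, batch):
        """Run inference on batch using the cached model for path."""
        model = self.get(path)
        return model(batch, training=False)


# Usage from any module in the web app:
#   result = ModelRegistry().predict("model.h5", x)
```

Every `ModelRegistry()` call returns the same object, so the file that imports your prediction function can call it repeatedly without triggering another `load_model`.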