I tried to estimate the prediction time of my Keras model and noticed something strange. Although prediction is normally fairly fast, every once in a while the model takes much longer to come up with a prediction. On top of that, those slow predictions get slower the longer the model runs. I added a minimal working example to reproduce the problem.
import time
import numpy as np
from sklearn.datasets import make_classification
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
# Make a dummy classification problem
X, y = make_classification()
# Make a dummy model
model = Sequential()
model.add(Dense(10, activation='relu',name='input',input_shape=(X.shape[1],)))
model.add(Dense(2, activation='softmax',name='predictions'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X, y, verbose=0, batch_size=20, epochs=100)
for i in range(1000):
    # Pick a random sample
    sample = np.expand_dims(X[np.random.randint(99), :], axis=0)
    # Record the prediction time 10x and then take the average
    start = time.time()
    for j in range(10):
        y_pred = model.predict_classes(sample)
    end = time.time()
    print('%d, %0.7f' % (i, (end-start)/10))
The time does not depend on the sample (it is being picked randomly). If the test is repeated, the indices in the for loop where the prediction takes longer are going to be (nearly) the same again.
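To check whether the slow predictions really land on the same indices across runs, the individual call times could be recorded instead of averaged over 10 calls. The following is only a sketch: `time_calls` and `find_spikes` are hypothetical helpers I made up for illustration (they are not part of Keras or TensorFlow), and the factor of 5 over the median is an arbitrary threshold.

```python
import statistics
import time


def time_calls(fn, n_calls=1000):
    """Return a list with the wall-clock duration (seconds) of each call to fn()."""
    durations = []
    for _ in range(n_calls):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def find_spikes(durations, factor=5.0):
    """Return (index, duration) pairs whose duration exceeds factor * median."""
    threshold = factor * statistics.median(durations)
    return [(i, d) for i, d in enumerate(durations) if d > threshold]


if __name__ == "__main__":
    # Stand-in workload so the sketch runs on its own. With the model above,
    # this would be: lambda: model.predict_classes(sample)
    durations = time_calls(lambda: sum(range(1000)), n_calls=200)
    print("median: %.7f s, max: %.7f s"
          % (statistics.median(durations), max(durations)))
    for i, d in find_spikes(durations):
        print("spike at call %d: %.7f s" % (i, d))
```

Running this twice and comparing the reported spike indices would show whether they repeat.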
I'm using:
tensorflow 2.0.0
python 3.7.4
For my application I need to guarantee that a prediction finishes within a certain time. With this behaviour that is impossible. What is going wrong? Is it a bug in Keras or a bug in the tensorflow backend?
EDIT:
predict_on_batch shows the same behavior, but the spikes are sparser.
Calling the model directly,
y_pred = model(sample, training=False).numpy()
also shows some heavy outliers; however, they do not increase over time.
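For a side-by-side comparison of the call styles mentioned above, a small harness along these lines might help. It is a sketch under assumptions: `compare_latency` is a hypothetical helper, and it reports the worst case next to the mean because the worst case is what matters for a time guarantee.

```python
import time


def compare_latency(candidates, n_calls=200):
    """Time each zero-argument callable; return {name: (mean_s, worst_s)}."""
    results = {}
    for name, fn in candidates.items():
        durations = []
        for _ in range(n_calls):
            start = time.perf_counter()
            fn()
            durations.append(time.perf_counter() - start)
        results[name] = (sum(durations) / len(durations), max(durations))
    return results


# With the model and sample from the example above, the candidates could be:
# candidates = {
#     "predict_classes": lambda: model.predict_classes(sample),
#     "predict_on_batch": lambda: model.predict_on_batch(sample),
#     "direct call": lambda: model(sample, training=False).numpy(),
# }
# for name, (mean_s, worst_s) in compare_latency(candidates).items():
#     print("%-16s mean %.7f s, worst %.7f s" % (name, mean_s, worst_s))
```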
EDIT 2:
I downgraded to the latest tensorflow 1 version (1.15). Not only is the problem gone, the "normal" prediction time also improved significantly! I do not see the two spikes as problematic: they did not appear when I repeated the test (at least not at the same indices, and not linearly increasing), and proportionally they are not as large as in the first plot.
We can thus conclude that this seems to be a problem inherent to tensorflow 2.0, which shows similar behaviour in other situations as @OverLordGoldDragon mentions.