I'm somewhat confused about what I'm seeing with my pretrained Keras models. I'm using a virtualenv with tensorflow-gpu==1.13.1, installed via pip install tensorflow-gpu. Here's a minimal working example you can run, based on the Keras documentation (hopefully it's up to date). I got the elephant image from here and saved it as elephant.jpeg.
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np
import tensorflow as tf
# Load the image.
img_path = 'data/elephant.jpeg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0) # shape (1,224,224,3)
x = preprocess_input(x)
# The basic full model
model = ResNet50(weights='imagenet')
# Make a session here
sess = tf.Session()
sess.graph.finalize()
# Predict, and decode the results into a list of tuples (class, description,
# probability) (one such list for each sample in the batch)
preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=4)[0])
Running the code will result in:
RuntimeError: Graph is finalized and cannot be modified.
Yet the weird thing is that if I insert an extra model.predict call before I finalize the graph, like this:
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np
import tensorflow as tf
# Load the image.
img_path = 'data/elephant.jpeg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0) # shape (1,224,224,3)
x = preprocess_input(x)
# The basic full model
model = ResNet50(weights='imagenet')
preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=4)[0])
# Make a session here
sess = tf.Session()
sess.graph.finalize()
# Predict, and decode the results into a list of tuples (class, description,
# probability) (one such list for each sample in the batch)
preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=4)[0])
The only difference is the two extra lines: I copied the prediction and print statement so they also run before the session and finalize calls. This version works, and both predictions succeed:
Predicted: [('n01871265', 'tusker', 0.5286887), ('n02504013', 'Indian_elephant', 0.4639527), ('n02504458', 'African_elephant', 0.0072972253), ('n02408429', 'water_buffalo', 2.6213302e-05)]
Predicted: [('n01871265', 'tusker', 0.5286887), ('n02504013', 'Indian_elephant', 0.4639527), ('n02504458', 'African_elephant', 0.0072972253), ('n02408429', 'water_buffalo', 2.6213302e-05)]
Here's why I'm confused and asking this question. I don't understand why a predict call is necessary before the sess.graph.finalize() call. I'm hoping to use pretrained models solely for feature extraction: I pass a numpy array into the net and get a numpy array back. (For that I'd actually want include_top=False, but I didn't do that above for the sake of simplicity.) Then I want to feed the result into a new network that I design with low-level TensorFlow ops.
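For context, here's a rough sketch of the feature-extraction setup I have in mind. It's only illustrative: the include_top=False / pooling='avg' extractor, the random dummy batch, and the "my_head" dense layer are assumptions for the sake of the example, not my actual downstream network.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
# Headless ResNet50 as a fixed feature extractor: numpy array in, numpy array out.
extractor = ResNet50(weights='imagenet', include_top=False, pooling='avg')
# Dummy batch standing in for real preprocessed images.
x = preprocess_input(np.random.uniform(0, 255, (1, 224, 224, 3)).astype(np.float32))
features = extractor.predict(x)  # shape (1, 2048)
# Feed the extracted features into a small hand-built low-level TF graph.
feat_ph = tf.placeholder(tf.float32, shape=(None, 2048), name='features')
logits = tf.layers.dense(feat_ph, 10, name='my_head')
with tf.Session() as sess:
    # Only initialize the new head's variables; the ResNet weights live in Keras's own session.
    sess.run(tf.variables_initializer(tf.global_variables(scope='my_head')))
    out = sess.run(logits, feed_dict={feat_ph: features})
    print(out.shape)  # (1, 10)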
It appears that I need to insert a "dummy" prediction call before the sess.graph.finalize() call in order to "get the graph set up." Is that intuition right?
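To make the question concrete, the workaround I currently have in mind looks like the sketch below. It's only a sketch: the np.zeros dummy batch is arbitrary, and K.get_session() is just my way of grabbing the session Keras is already using so I finalize the right graph.
import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras import backend as K
model = ResNet50(weights='imagenet')
# Throwaway call on dummy data: the first predict() lazily builds Keras's
# internal prediction function, which still needs to modify the graph.
model.predict(np.zeros((1, 224, 224, 3), dtype=np.float32))
# Finalize the graph of the session Keras is actually using.
K.get_session().graph.finalize()
# Subsequent predictions run against the now-frozen graph without error.
preds = model.predict(np.zeros((1, 224, 224, 3), dtype=np.float32))
print(preds.shape)  # (1, 1000)
Is this really the intended pattern, or is there a cleaner way to build everything up front before finalizing?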