Not understanding model.predict_classes with generator

Question

My question is related to this post, and I have also tried to implement the solutions from here and here. Below I also include my attempt at implementing the code according to these solutions (my implementation/output is not correct). My code is as follows, using the rock, paper, scissors data:

!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/rps.zip \
    -O /tmp/rps.zip
  
!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/rps-test-set.zip \
    -O /tmp/rps-test-set.zip

import os
import zipfile

local_zip = '/tmp/rps.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall('/tmp/')
zip_ref.close()

local_zip = '/tmp/rps-test-set.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall('/tmp/')
zip_ref.close()

rock_dir = os.path.join('/tmp/rps/rock')
paper_dir = os.path.join('/tmp/rps/paper')
scissors_dir = os.path.join('/tmp/rps/scissors')

print('total training rock images:', len(os.listdir(rock_dir)))
print('total training paper images:', len(os.listdir(paper_dir)))
print('total training scissors images:', len(os.listdir(scissors_dir)))

rock_files = os.listdir(rock_dir)
print(rock_files[:10])

paper_files = os.listdir(paper_dir)
print(paper_files[:10])

scissors_files = os.listdir(scissors_dir)
print(scissors_files[:10])

import tensorflow as tf
import keras_preprocessing
from keras_preprocessing import image
from keras_preprocessing.image import ImageDataGenerator

TRAINING_DIR = "/tmp/rps/"
training_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

VALIDATION_DIR = "/tmp/rps-test-set/"
validation_datagen = ImageDataGenerator(rescale = 1./255)

train_generator = training_datagen.flow_from_directory(
    TRAINING_DIR,
    target_size=(150,150),
    class_mode='categorical',
  batch_size=126
)

validation_generator = validation_datagen.flow_from_directory(
    VALIDATION_DIR,
    target_size=(150,150),
    class_mode='categorical',
  batch_size=126
)

model = tf.keras.models.Sequential([
    # Note the input shape is the desired size of the image 150x150 with 3 bytes color
    # This is the first convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # The third convolution
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # The fourth convolution
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),
    # 512 neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax')
])


model.summary()

model.compile(loss = 'categorical_crossentropy', optimizer='rmsprop', metrics=[tf.keras.metrics.Recall()])

history = model.fit(train_generator, epochs=25, steps_per_epoch=20, validation_data = validation_generator, verbose = 1, validation_steps=3)

model.save("rps.h5")

The output shows a model with quite a good fit. I now want to test it against completely new data. (Please note that this data is, unfortunately, named "validation".)

import shutil
import glob
import numpy as np
import os as os

!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/rps-validation.zip \
    -O /tmp/rps-validation-set.zip

local_zip = '/tmp/rps-validation-set.zip'
zip_ref = zipfile.ZipFile(local_zip, 'r')
zip_ref.extractall('/tmp/rps-validation-set')
zip_ref.close()

os.mkdir("/tmp/rps-validation-set/paper/")
os.mkdir("/tmp/rps-validation-set/rock/")
os.mkdir("/tmp/rps-validation-set/scissors/")

dest_dir = "/tmp/rps-validation-set/paper"
for file in glob.glob(r'/tmp/rps-validation-set/paper*.png'):
    print(file)
    shutil.move(file, dest_dir)

dest_dir = "/tmp/rps-validation-set/rock"
for file in glob.glob(r'/tmp/rps-validation-set/rock*.png'):
    print(file)
    shutil.move(file, dest_dir)

dest_dir = "/tmp/rps-validation-set/scissors"
for file in glob.glob(r'/tmp/rps-validation-set/scissors*.png'):
    print(file)
    shutil.move(file, dest_dir)

!rm -r /tmp/rps-validation-set/.ipynb_checkpoints

new_DIR = "/tmp/rps-validation-set/"
new_datagen = ImageDataGenerator(rescale = 1./255)

new_generator = new_datagen.flow_from_directory(
    new_DIR,
    target_size=(150,150),
    class_mode='categorical',
  batch_size=126
)

print(new_generator.class_indices)
print(new_generator.classes)
print(new_generator.num_classes)

print(train_generator.class_indices)
print(train_generator.classes)
print(train_generator.num_classes)

model.evaluate(new_generator)

classes = model.predict(new_generator)
model.predict(new_generator)
np.argmax(model.predict(new_generator), axis=-1)
print(classes)

# output from here on:
print("model evaluate output ", model.evaluate(new_generator))
print("train_generator classes: ", train_generator.classes)
print("new_generator classes:   ", new_generator.classes)
print("train_generator class indices:   ",train_generator.class_indices)
print("new_generator class indices:     ",new_generator.class_indices)
print("model prediction ", model.predict_classes(new_generator))
print("actual values/labels    ", new_generator.labels)
print("filenames ", new_generator.filenames)

print("\n manually predict first 4 single paper images: \n ")

path = "/tmp/rps-validation-set/paper/paper-hires1.png"
img = image.load_img(path, target_size=(150, 150))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)

images = np.vstack([x])
classes = model.predict(images, batch_size=10)
print(path)
print(classes)

path = "/tmp/rps-validation-set/paper/paper-hires2.png"
img = image.load_img(path, target_size=(150, 150))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)

images = np.vstack([x])
classes = model.predict(images, batch_size=10)
print(path)
print(classes)

path = "/tmp/rps-validation-set/paper/paper1.png"
img = image.load_img(path, target_size=(150, 150))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)

images = np.vstack([x])
classes = model.predict(images, batch_size=10)
print(path)
print(classes)

path = "/tmp/rps-validation-set/paper/paper2.png"
img = image.load_img(path, target_size=(150, 150))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)

images = np.vstack([x])
classes = model.predict(images, batch_size=10)
print(path)
print(classes)

print("\n trying different solution \n")
predictions = model.predict(new_generator)       
predictions = np.argmax(predictions, axis=-1) #multiple categories
label_map = (train_generator.class_indices)
label_map = dict((v,k) for k,v in label_map.items()) #flip k,v
predictions = [label_map[k] for k in predictions]

print("predictions adjusted ", predictions)
print("actual values    ", new_generator.labels)

The relevant output for my problem is as follows (from # output from here on: on):

First I check the performance with model.evaluate. The high recall and low loss show me that the predictions are (almost) perfect, so I would expect no difference between the actual values/labels and the predicted classes.

I now want to check/visualize this by showing the predictions against the actual values/labels of the input.

I am not understanding the following:

1.) Each time I run print("model prediction ", model.predict_classes(new_generator)), the output of model.predict_classes is different:

Why? The model is fixed and I feed it the same values, so I would expect the predictions to be stable. The same holds for print(model.predict(new_generator)): every time I run it, I get different output.

2.) The predictions do not match the actual values. I do not understand why, or how I can achieve what I want: to match the predictions with the corresponding actual values and check where they differ. I thought the order might be different, and indeed this post mentions that and provides two solutions. Removing the rescaling in the image generator is not a good solution in my eyes. I tried to adapt the following proposed solution:

import numpy as np
predictions = model.predict_generator(self.test_generator)
predictions = np.argmax(predictions, axis=-1) #multiple categories

label_map = (train_generator.class_indices)
label_map = dict((v,k) for k,v in label_map.items()) #flip k,v
predictions = [label_map[k] for k in predictions]

to my code (see "trying different solution" code block):

Output:

But this is wrong / my implementation is wrong.

I also tried the following from this thread:

from glob import glob
class_names = glob("/tmp/rps-validation-set/*") # Reads all the folders in which images are present
class_names = sorted(class_names) # Sorting them
name_id_map = dict(zip(class_names, range(len(class_names))))

But this is not helping.

When I manually predict the first 4 images, the output from model.predict is different: it is correct. [1. 0. 0] is the correct output for all 4 images. So when I run model.predict / model.predict_classes on a single image, giving the filename, it is correct. However, when I run it on an image data generator, is it somehow shuffled?

3.) What I am not understanding is the difference between model.predict_classes and model.predict, these two:

print(model.predict_classes(new_generator))
print(model.predict(new_generator))

The predicted probabilities do not match the predicted classes. For example, already for the first entry the largest probability is 9.99e-01, yet the predicted class is 1. For the second entry it does match: the largest probability is again 9.99e-01, it corresponds to the last class, and the predicted class is indeed 2. So everything seems completely shuffled. Since the first images belong to the first class, [1 0 0] is correct, so I would expect the highest probability to be in the first class (corresponding to the value 0) (it is not) and the predicted class to be the first class (corresponding to the value 0) (it is not).

4.) When I run

images = np.vstack([x])
classes = model.predict(images, batch_size=10)
classes2 = model.predict_classes(images, batch_size=10)
print(path)
print(classes)
print(classes2)

I get

How do I get the probabilities? So something like

mujjiga · Accepted Answer · 2020-07-12T20:39:56.587

You did everything correctly except for one thing: when creating the data generator for testing (new_generator), you did not set shuffle=False. The generator therefore reshuffles the data on every pass, so the results cannot be reproduced from run to run and the predictions no longer line up with new_generator.classes, which stays in directory order.

FIX

new_generator = new_datagen.flow_from_directory(
    new_DIR,
    target_size=(150,150),
    class_mode='categorical',
    batch_size=126,
    shuffle=False
)

And everything will work fine with your existing code.
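To see why shuffling alone produces the mismatch, here is a minimal sketch that needs no trained model. It assumes a pretend classifier that is always right and uses made-up labels in place of new_generator.classes; all names in it are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for new_generator.classes: 9 images, 3 per class,
# listed in directory (sorted) order.
true_labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

def perfect_model(labels_in_feed_order):
    """Pretend classifier that predicts the right class for every image it is fed."""
    return labels_in_feed_order.copy()

rng = np.random.default_rng(0)

# shuffle=True: images are fed in a random order, but .classes keeps directory order.
feed_order = rng.permutation(len(true_labels))
shuffled_preds = perfect_model(true_labels[feed_order])

# shuffle=False: images are fed in directory order, so predictions line up with .classes.
ordered_preds = perfect_model(true_labels)

print("agreement with shuffle=True: ", np.mean(shuffled_preds == true_labels))
print("agreement with shuffle=False:", np.mean(ordered_preds == true_labels))
```

Even a perfect model looks wrong in the shuffled case, because each prediction is compared against the label of a different image. model.evaluate is unaffected, since the generator yields every image together with its own label.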

Question 0:

From the docs for the evaluate method:

Returns the loss value & metrics values for the model in test mode.

print("model evaluate output ", model.evaluate(new_generator))

You will see two scalars: the first is the loss and the second is the recall.

Question 1:

With shuffle=False you will get reproducible results with each run.

Question 2, 3:

Again, setting shuffle=False will fix the issue. The correct way to find the class index from the probability scores is:

print (np.argmax(model.predict(new_generator), axis=1))

Which you can verify will be same as

print (model.predict_classes(new_generator))
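As a sketch of the comparison the question asks for, the argmax indices can be mapped back to class names and checked against the true labels. The probabilities, labels and filenames here are made up for illustration; with the real objects they would come from model.predict(new_generator), new_generator.classes, new_generator.filenames and train_generator.class_indices. Note also that predict_classes is no longer available on Sequential in newer TensorFlow releases, so the np.argmax form is the one to rely on.

```python
import numpy as np

# Made-up softmax outputs: one row per image, one column per class.
probs = np.array([
    [0.98, 0.01, 0.01],
    [0.10, 0.85, 0.05],
    [0.20, 0.30, 0.50],
    [0.05, 0.05, 0.90],
])
true_labels = np.array([0, 1, 1, 2])                    # stand-in for new_generator.classes
class_indices = {"paper": 0, "rock": 1, "scissors": 2}  # stand-in for train_generator.class_indices
filenames = ["paper/a.png", "rock/b.png", "rock/c.png", "scissors/d.png"]  # hypothetical names

# Class index with the highest probability for each image.
pred_idx = np.argmax(probs, axis=-1)

# Invert the name -> index mapping so indices can be turned back into names.
index_to_name = {v: k for k, v in class_indices.items()}
pred_names = [index_to_name[i] for i in pred_idx]

# Positions where the prediction disagrees with the true label.
wrong = np.flatnonzero(pred_idx != true_labels)
for i in wrong:
    print(filenames[i], "predicted", pred_names[i], "actual", index_to_name[true_labels[i]])
```

This only gives meaningful results when the generator was created with shuffle=False, so that row i of the predictions belongs to filenames[i].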

Question 4:

model.predict will give you the class probability
model.predict_classes will give you the class label based on the highest probability class.
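If model.predict prints something like [[1. 0. 0.]], those are still probabilities: a very confident softmax saturates in float32. The following sketch illustrates this with hand-picked logits (the logit values are assumptions chosen for the illustration, not taken from the model in the question).

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Extremely confident logits: the small terms underflow to 0 in float32.
confident = softmax(np.array([[200.0, 0.0, 0.0]], dtype=np.float32))
# Moderately confident logits: close to 1, but not exactly 1.
moderate = softmax(np.array([[16.0, 0.0, 2.0]], dtype=np.float32))

print(confident)  # [[1. 0. 0.]]

# Show more digits without changing the global print settings.
with np.printoptions(precision=10, suppress=False):
    print(moderate)
```

Inputs that are not rescaled to the range the model was trained on (for example raw 0-255 pixel values) can plausibly push the logits into this saturated regime.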

Only the training and validation data have to be shuffled for randomization. You should not shuffle the test data, for which you do not know (or assume you do not know) the ground truth.

Making predictions

The data augmentations done in the training data generator are not required during prediction, as they are applied at random only to generate varied training data. You can use the code below to make predictions on the images under /tmp/rps-validation-set/. In general you will not know the classes of these images, so they do not have to be sorted into class subfolders. Note, however, that flow_from_directory still expects the images to sit in some subdirectory of the given directory, even with class_mode=None.

test_datagen = ImageDataGenerator(rescale = 1./255)

test_generator = test_datagen.flow_from_directory(
    "/tmp/rps-validation-set/",
    target_size=(150,150),
    class_mode=None,
    batch_size=126,
    shuffle=False
)

model.predict(test_generator)

Note that this applies the same fixed rescaling transform that you applied for training and validation, but skips the random image augmentations.
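The same fixed rescaling can also be applied by hand when predicting on a few individually loaded images. The sketch below uses random arrays in place of image.img_to_array(image.load_img(path, target_size=(150, 150))), and to_model_input is a hypothetical helper name. The final predict call is left as a comment because it needs the trained model.

```python
import numpy as np

def to_model_input(img_array):
    """Apply the same fixed 1/255 rescaling the generators use; no random augmentation."""
    return img_array.astype("float32") / 255.0

# Random pixel arrays stand in for images loaded from disk (values 0..255).
rng = np.random.default_rng(0)
raw_images = [rng.integers(0, 256, size=(150, 150, 3)).astype("float32") for _ in range(4)]

# Stack the individual images into one batch so predict runs once, not per image.
batch = np.stack([to_model_input(img) for img in raw_images])

print(batch.shape, batch.dtype, float(batch.min()), float(batch.max()))
# probabilities = model.predict(batch)  # with the trained model from the question
```

Collecting the images into a single batch avoids calling predict once per image, which is slow for larger numbers of files.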

edited Jul 12 '20 at 20:39

answered Jul 12 '20 at 12:32


@mujiiga Thanks a lot for your very good answer! Regarding question 4: model.predict does not give me the class probabilities, when I run: print(model.predict(images, batch_size=10)) I do not get probabilities, I get as output: [[1. 0. 0.]], but I want something like this: [1.14104751e-07, 1.68795977e-09, 9.99999881e-01] Update: Mh, I guess maybe this is rounded? Could it be? So how can I get the exact values, not rounded? – Stat Tistician Jul 12 '20 at 19:58
How can I load single images correctly applying the image transformations? – Stat Tistician Jul 12 '20 at 20:09
@StatTistician `predict` will return the probabilities. If the model is very confident, the probability of the true class will be very near to 1, but due to the limits of numerical precision you will see it as 1 (or 0 at the other extreme). – mujjiga Jul 12 '20 at 20:12
Thanks for the fast response, one further question regarding the prediction of single images and applying transformations or not: Here in the original code:https://github.com/lmoroney/dlaicourse/blob/09767115b53c8ac99df7b85fc67a3cea9fed2482/Course%202%20-%20Part%208%20-%20Lesson%202%20-%20Notebook%20(RockPaperScissors).ipynb It is done the same way, so a single image is passed to predict, without applying the transformations? – Stat Tistician Jul 12 '20 at 20:25
@mujiiga No, I think it is correct to NOT apply image augmentation to test data. So this is only done to make training more difficult, it is not the case that image augmentation is also applied to validation or test data. So when I created my validation imagedatagenerator I did not apply the image augmentation, only to train data, but not to validation or test data. So there I also did not apply any image augmentation? – Stat Tistician Jul 12 '20 at 20:32
So "You will have to apply the same image transforms on the images before making predictions on them." I think here on this point you are wrong. – Stat Tistician Jul 12 '20 at 20:36
@StatTistician I stand corrected :) I mistook them for standard image transforms, but except for `rescale` they are indeed image augmentations for generating different images. I have updated the answer. – mujjiga Jul 12 '20 at 20:41
@mujiiga Once again big thanks and perfect. However, please note that your solution now is even more complicated than necessary, as my simple solution with image.load_img(path, target_size=(150, 150)) does all what is needed. So no imagedatagenerator with batch and rescale factor needed when just checking single images. To do it for all images in my test folder and then to evaluate the complete performance I already had the new_generator before with new_DIR = "/tmp/rps-validation-set/". So for single images this way is not needed. – Stat Tistician Jul 12 '20 at 20:43
Yes, but say you have 100 images to make predictions you will end up writing a loop to collect those images into a batch and then run the predict over the batch. It is slow to make a prediction on one image at a time. – mujjiga Jul 12 '20 at 20:47
Well, I did apply it to all the images in the folder with the new_generator and here I just wanted to check some single images "for fun". So your code was already implemented, check my new_generator. Here at the end I just wanted to manually do it for some examples. – Stat Tistician Jul 12 '20 at 20:47
One further question, why did you set class_mode=None? I looked it up, but did not understand the idea behind this. So I used class_mode='categorical'. – Stat Tistician Jul 12 '20 at 21:12
`None: no targets are returned (the generator will only yield batches of image data, which is useful to use in model.predict_generator())`, from docs https://keras.io/api/preprocessing/image/. It really does not matter in this case. – mujjiga Jul 12 '20 at 21:14