
I'm trying to get the per-class probabilities out of a Keras model. Please find a sample Keras model below:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Activation, Flatten, Dense, Dropout

width = 80
height = 80
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=( width, height, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())  # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64))
model.add(Activation('relu'))
#model.add(Dropout(0.5))
model.add(Dense(2))
model.add(Activation('softmax'))

model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

However, after the model is trained and I load an image to be predicted via:

from keras.preprocessing import image
import numpy as np

img = image.load_img('Test2.jpg', target_size=(80, 80))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
images = np.vstack([x])
classes = model.predict_proba(images, batch_size=1)
print(classes)

[[ 0.  1.]]

I still get the class labels rather than probabilities. Any hints as to what I'm doing wrong?

EDIT: This is how the model is trained:

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest')

# this is the configuration we will use for validation
# (note: it applies the same augmentation as the training generator; no rescaling)
test_datagen = ImageDataGenerator(
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest')


train_generator = train_datagen.flow_from_directory(
        '.\\train',  # this is the target directory
        target_size=(width, height),  # all images will be resized to 80x80
        batch_size=batch_size,
        class_mode='binary',
        shuffle=True)  # since we use binary_crossentropy loss, we need binary labels

# this is a similar generator, for validation data
validation_generator = test_datagen.flow_from_directory(
        '.\\validate',
        target_size=(width, height),
        batch_size=batch_size,
        class_mode='binary',
        shuffle=True)

model.fit_generator(
        train_generator,
        steps_per_epoch=4000,
        epochs=2,
        validation_data=validation_generator,
        validation_steps=1600)
TechCrap

  • Have you normalized your training data? Have you normalized your input image accordingly? – Marcin Możejko Jan 11 '18 at 23:20
  • 0 and 1 are also valid probabilities. – Dr. Snoopy Jan 12 '18 at 07:46
  • Hey guys, thanks for the prompt reactions. I updated the question with the code that trains the model / loads the samples. I don't think I'm doing normalization anywhere, or am I wrong? (Keras newbie.) It looks rather odd to me that the probability of either class would be that high, right? However, I will try different samples and let you know. – TechCrap Jan 12 '18 at 09:56

1 Answer


The problem is that you are using the 'sparse_categorical_crossentropy' loss together with class_mode='binary' in your flow_from_directory calls.

You have two possibilities here:

  1. Change the loss to 'categorical_crossentropy' and set class_mode='categorical'.
  2. Leave the loss as is but set class_mode='sparse'.

Either will work; a minimal sketch of option 1 is shown below.
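For example, option 1 could look something like this. It's only a sketch reusing the names from your question (model, train_datagen, width, height, batch_size); the validation generator needs the same class_mode:

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

train_generator = train_datagen.flow_from_directory(
        '.\\train',
        target_size=(width, height),
        batch_size=batch_size,
        class_mode='categorical',  # yields one-hot labels such as [0, 1]
        shuffle=True)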

Refer to this answer for the difference between the two losses (it is about TensorFlow, but the same holds for Keras). The short version is that the sparse loss expects labels as integer class indices (e.g. 0, 1, 2, ...), whereas the non-sparse one expects one-hot encoded vectors (e.g. [0, 1, 0, 0]).
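To illustrate with a small, self-contained example (not taken from your code):

import numpy as np
from keras.utils import to_categorical

# Integer class indices, as expected by sparse_categorical_crossentropy:
sparse_labels = np.array([0, 1, 1, 0])

# One-hot vectors, as expected by categorical_crossentropy:
one_hot_labels = to_categorical(sparse_labels, num_classes=2)
# -> [[1., 0.],
#     [0., 1.],
#     [0., 1.],
#     [1., 0.]]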

Cheers

EDIT: as @Simeon Kredatus pointed out, this was a normalization issue. It can be solved by setting the appropriate flags in the ImageDataGenerator constructors for both the training and test sets, namely samplewise_center=True and samplewise_std_normalization=True. I have updated the answer so people can see the solution. In general, remember the garbage-in, garbage-out principle.
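As a rough sketch of that fix (reusing the names from the question; note that ImageDataGenerator.standardize applies the generator's configured normalization to a sample, so the same scaling can be reused on a single image at prediction time):

from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
import numpy as np

train_datagen = ImageDataGenerator(
        samplewise_center=True,              # zero-center each sample
        samplewise_std_normalization=True,   # scale each sample to unit std
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest')

# At prediction time, normalize the image with the same configuration:
img = image.load_img('Test2.jpg', target_size=(80, 80))
x = image.img_to_array(img)
x = train_datagen.standardize(x)   # same per-sample normalization as training
x = np.expand_dims(x, axis=0)
probs = model.predict(x, batch_size=1)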

Daniele Grattarola
  • Hey, thanks for the reply. Now I'm getting following error: Error when checking target: expected activation_5 to have shape (None, 2) but got array with shape (8, 1). Can this be related to the class_mode of train / validation generators? – TechCrap Jan 12 '18 at 11:32
  • Yes, I'm sorry, I hadn't checked how ImageDataGenerator works. You can either use 'categorical_crossentropy' with class_mode='categorical' or 'sparse_categorical_crossentropy' with class_mode='sparse'. I'll edit the answer to match this new info :) – Daniele Grattarola Jan 12 '18 at 14:22
  • Either of the above-mentioned solutions still gives me an array like [[1. 0.]] -> is it really possible the probabilities would be either 1 or 0 for the respective classes? – TechCrap Jan 12 '18 at 15:23
  • It's possible, sure, just really weird unless you have the perfect dataset. Can you try to run the same prediction before training and post the model's output? Also how is the dataset built? – Daniele Grattarola Jan 13 '18 at 19:30
  • Hey, just to wrap it up - in each of the ImageDataGenerator instances I had to include the normalization flags samplewise_center=True and samplewise_std_normalization=True in the constructor. The thing is, without the data normalized, the learning procedure tends to go astray (yes, now it's obvious :) ). So the correct answer is twofold: you answered the first part, and @Marcin Mozejko had a point as well :). You can update your answer if you'd like to; I will accept it as the correct one. – TechCrap Jan 22 '18 at 19:06
  • Good to know, I updated the answer so everyone can see. Cheers :) – Daniele Grattarola Jan 23 '18 at 15:07