
Here is my code:

# importing all of the libraries necessary
import PIL.ImageFile
from keras.preprocessing import image
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Dropout, Flatten, Dense
import tensorflow as tf
from keras import regularizers
from matplotlib import pyplot as plt
import numpy as np
import random
import os

PIL.ImageFile.LOAD_TRUNCATED_IMAGES = True

img_width, img_height = 512, 384  # note: unused below; the generators resize everything to 128x128
# All posible categories
categories = ["cardboard", "glass", "metal", "paper", "plastic"]
# This is the path to the dataset
train_data_dir = '/Users/lukasrois/ve/Train_Data'
test_data_dir = '/Users/lukasrois/ve/Test_Data'

classifier = Sequential()
# Adam optimizer: the learning rate defines how fast the model learns,
# and clipnorm caps the gradient norm for stability.
opt = tf.keras.optimizers.Adam(learning_rate=0.0001, clipnorm=2)


# This is the neural network! Each conv layer uses L2 weight regularization.
classifier.add(Conv2D(128, (3, 3), input_shape=(128, 128, 3), activation='relu',
                      kernel_regularizer=regularizers.l2(0.001)))

classifier.add(MaxPooling2D(pool_size=(2,2)))

classifier.add(Conv2D(64, (3, 3), activation='relu', kernel_regularizer=regularizers.l2(0.001)))

classifier.add(MaxPooling2D(pool_size=(2,2)))


classifier.add(Conv2D(32, (3, 3), activation='relu', kernel_regularizer=regularizers.l2(0.001)))

classifier.add(MaxPooling2D(pool_size=(2,2)))

classifier.add(Flatten())
classifier.add(Dropout(0.1))
classifier.add(Dense(1024, activation='relu', kernel_regularizer=regularizers.l2(0.001)))


# The network ends in 5 outputs, one per category: cardboard, glass, metal, paper, and plastic
classifier.add(Dense(5, activation='softmax'))
classifier.summary()
classifier.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])


train_datagen = image.ImageDataGenerator(
    rescale=1./255,
    rotation_range=90,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    vertical_flip=True,
    brightness_range=[0.4, 1.5]
)



test_datagen = image.ImageDataGenerator(rescale=1./255)


# This generator yields batches for training the model
train_set = train_datagen.flow_from_directory(train_data_dir, target_size=(128, 128),
                                              batch_size=16, class_mode='categorical')
# This generator yields batches for testing the model
test_set = test_datagen.flow_from_directory(test_data_dir, target_size=(128, 128),
                                            batch_size=16, class_mode='categorical', shuffle=True)


early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="accuracy",
    min_delta=0,
    patience=50,
    verbose=1,
    mode="auto",
    baseline=None,
    restore_best_weights=False,
)


# Training the model for up to 2000 epochs (early stopping may end it sooner).
hist = classifier.fit(train_set, steps_per_epoch=None,
                      epochs=2000, validation_data=test_set, shuffle=True,
                      callbacks=[early_stop])

When I run this and the model has finished training, I check the accuracy using classifier.evaluate(test_set); this is the output:

[0.9584307074546814, 0.7704455852508545]
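
Since evaluate returns the loss followed by the compiled metrics, the second number is the accuracy:

loss, test_acc = classifier.evaluate(test_set)
print(test_acc)  # ~0.77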

However, if I manually check the accuracy like this:

y_pred = classifier.predict(test_set)
acc = sum([np.argmax(test_set.classes[i])==np.argmax(y_pred[i]) for i in range(1773)])/1773
acc

I get roughly 0.2, i.e. 20% accuracy.

Also, I realized that when I run this:

classifier.predict(test_set[0][0])

I get this:

array([[1.42874369e-05, 8.84969294e-01, 4.15825620e-02, 7.01430589e-02,
        3.29073309e-03],

Even though my final 5 neurons have a softmax activation function, which I expected to make the probabilities add up to 1.

  • This is yet another classic mistake: generator.classes does not give you the class labels in the same order as the generator yields them. Where did you find the code that suggested doing this? – Dr. Snoopy May 31 '22 at 19:23
  • Also I do not understand your last point about softmax. – Dr. Snoopy May 31 '22 at 19:24
  • https://stackoverflow.com/questions/42081257/why-binary-crossentropy-and-categorical-crossentropy-give-different-performances – Lukas Rois May 31 '22 at 19:24
  • Softmax probabilities are all supposed to add up to 1. I found this here https://developers.google.com/machine-learning/crash-course/multi-class-neural-networks/softmax – Lukas Rois May 31 '22 at 19:25
  • Yes, and the numbers in your question do. Sum them, and make sure you are aware of the scientific notation (see the quick check after these comments). – Dr. Snoopy May 31 '22 at 19:26
  • And I am talking about the .classes attribute of your generator; do not use it to compute any metric. – Dr. Snoopy May 31 '22 at 19:29
  • OK I understand. Could you just suggest how I could calculate the accuracy metric without using classifier.evaluate()? – Lukas Rois May 31 '22 at 19:39
  • Sure: with a for loop over the batches in the generator (you can index it) you get inputs and labels, from which you can predict and compute accuracy for each batch, and then aggregate this into a global accuracy measurement. – Dr. Snoopy May 31 '22 at 19:46
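
A quick check of the scientific-notation point from the comments, using the predicted row shown in the question:

import numpy as np

row = np.array([1.42874369e-05, 8.84969294e-01, 4.15825620e-02,
                7.01430589e-02, 3.29073309e-03])
print(row.sum())  # ~1.0, exactly what softmax should produce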

1 Answer


In case anybody finds this with the same problem, use this to compute the accuracy batch by batch, as suggested in the comments:

total_count = 0
accuracy_count = 0
# A Keras generator loops forever, so iterate over exactly one pass of the data
for batch_index in range(len(test_set)):
    images, y_batch = test_set[batch_index]
    predict_batch = classifier.predict(images)
    for i in range(len(images)):  # the last batch may be smaller than batch_size
        total_count += 1
        # y_batch is one-hot (class_mode='categorical'), so argmax gives the class index
        if np.argmax(predict_batch[i]) == np.argmax(y_batch[i]):
            accuracy_count += 1
print(accuracy_count / total_count)
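
Alternatively, recreating the test generator with shuffle=False makes the prediction order match test_set.classes, so the accuracy can be computed in one pass (a sketch, assuming the same generator setup as in the question):

test_set = test_datagen.flow_from_directory(test_data_dir, target_size=(128, 128),
                                            batch_size=16, class_mode='categorical', shuffle=False)
y_pred = classifier.predict(test_set)
acc = np.mean(np.argmax(y_pred, axis=1) == test_set.classes)
print(acc)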