I'm training a model to classify 3 types of vehicles. Everything seems to go well until I try to predict anything with the model: the predictions are completely random.
I'm training on 15,000 images across 3 classes (5,000 each) and validating on 6,000 images across the same 3 classes (2,000 each), with some data augmentation.
I'm using Keras with the TensorFlow-GPU backend, and producing the classification report and confusion matrix with scikit-learn. I have no idea why the loss is low and the accuracy is high on both the training and validation sets during training, yet the confusion matrix is completely random.
Image Data Generators
import numpy as np
import keras
from keras import optimizers
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Activation, Flatten, Dense, Dropout
from keras.preprocessing.image import ImageDataGenerator
from sklearn.metrics import confusion_matrix, classification_report

img_width, img_height = 96, 96
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 15000
nb_validation_samples = 6000
epochs = 200
batch_size = 64
train_datagen = ImageDataGenerator(
rescale=1./255,
shear_range=0.3,
zoom_range=0.3,
horizontal_flip=True
)
test_datagen = ImageDataGenerator(rescale=1./255)  # same rescaling as for training
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True)
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True)
Building the model
image_shape = (img_width, img_height, 3)  # assuming 3-channel RGB input
model = Sequential()
model.add(Conv2D(32, (5, 5), input_shape=image_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (5, 5)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(100))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(100))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(3))
model.add(Activation('softmax'))
adam = optimizers.Adam(lr=0.0016643)
model.compile(optimizer=adam, loss='categorical_crossentropy',
              metrics=['accuracy'])
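For reference, the resulting layer shapes and parameter counts can be checked with a one-liner (a minimal sketch):
model.summary()  # prints each layer's output shape and parameter count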
Model fitting
es = keras.callbacks.EarlyStopping(
    monitor='val_loss', min_delta=0, patience=5, verbose=0, mode='auto')
history = model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    callbacks=[es],
    use_multiprocessing=False,
    workers=6)
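After training, the learning curves can be plotted from the returned history object (a sketch assuming matplotlib is installed; the keys match those printed at the end of the output below):
import matplotlib.pyplot as plt

plt.plot(history.history['acc'], label='train acc')
plt.plot(history.history['val_acc'], label='val acc')
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.xlabel('epoch')
plt.legend()
plt.show()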
Testing my model
Y_pred = model.predict_generator(
validation_generator, steps=nb_validation_samples // batch_size)
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
cm = confusion_matrix(validation_generator.classes, y_pred)
print(cm)
print('Classification Report')
target_names = ['Car', 'Bus', 'Truck']
print(classification_report(validation_generator.classes,
y_pred, target_names=target_names))
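As a sanity check, the validation accuracy can also be computed directly on the same generator (a minimal sketch; evaluate_generator returns the loss followed by the compiled metrics):
score = model.evaluate_generator(
    validation_generator, steps=nb_validation_samples // batch_size)
print('val loss: %.4f, val acc: %.4f' % (score[0], score[1]))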
The output I get looks something like this:
Output
Epoch 1/200
234/234 [==============================] - 35s 149ms/step - loss: 0.9103 - acc: 0.5645 - val_loss: 0.6354 - val_acc: 0.7419
Epoch 2/200
234/234 [==============================] - 30s 130ms/step - loss: 0.6804 - acc: 0.7181 - val_loss: 0.4679 - val_acc: 0.8117
Epoch 3/200
234/234 [==============================] - 30s 129ms/step - loss: 0.6027 - acc: 0.7573 - val_loss: 0.4401 - val_acc: 0.8238
.
.
.
Epoch 37/200
234/234 [==============================] - 30s 128ms/step - loss: 0.2667 - acc: 0.9018 - val_loss: 0.2095 - val_acc: 0.9276
Epoch 38/200
234/234 [==============================] - 30s 129ms/step - loss: 0.2711 - acc: 0.9037 - val_loss: 0.1995 - val_acc: 0.9353
# Training stops here due to EarlyStopping
Confusion Matrix
[[659 680 661]
[684 636 680]
[657 658 685]]
Classification Report
              precision    recall  f1-score   support

         Car       0.33      0.33      0.33      2000
         Bus       0.32      0.32      0.32      2000
       Truck       0.34      0.34      0.34      2000

   micro avg       0.33      0.33      0.33      6000
   macro avg       0.33      0.33      0.33      6000
weighted avg       0.33      0.33      0.33      6000
Printing history.history.keys() additionally gives: dict_keys(['val_loss', 'val_acc', 'loss', 'acc'])