I have a dataset of images together with bounding box coordinates for the objects in them, and I chose 5 object classes from the dataset for classification. For training, I cropped every object from every image using the bounding box coordinates, one-hot encoded the object names (['A', 'B', 'C', 'D', 'E']), did a train-test split, and then trained. I got 99% validation accuracy. When I test the model on cropped images that contain only one object each (no background), it classifies the object perfectly.
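Roughly, my preprocessing looks like this (the annotation structure, the paths, and the numbers are only illustrative, not my exact code):

import cv2
import numpy as np

# Illustrative annotation structure (not my real data): one entry per object instance
annotations = [
    {'image': 'images/img_001.jpg', 'label': 'A', 'box': (34, 50, 180, 210)},
    # ...
]

crops, labels = [], []
for ann in annotations:
    img = cv2.imread(ann['image'])
    x1, y1, x2, y2 = ann['box']
    crop = img[y1:y2, x1:x2]                  # crop the object using its bounding box
    crop = cv2.resize(crop, (224, 224))       # resize to the network's input size
    crops.append(crop)
    labels.append(ann['label'])

x = np.array(crops, dtype='float32') / 255.0  # scale pixel values to [0, 1]
labels = np.array(labels)

The one-hot encoding and the train-test split happen after this, as shown in the code below.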
But when I test the model on an image that contains all 5 objects, the prediction is not accurate. The model returns a predicted probability for every class, but it always assigns a high probability to the first object (object A) only, like this:
[[1.00000000e+00 2.64929882e-18 4.15273056e-17 1.11363124e-26 4.15807750e-22]]
I do not understand this: the test image contains all 5 objects, so I expected the model to give a high probability for every object instead of a high probability for only one of them (object A). Could you please help me understand this issue? What should I do to get a high probability for every object when the input image contains all 5 objects?
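This is roughly how I run the prediction on the full image (the path and the resize/rescale steps are just an example of my preprocessing):

import cv2
import numpy as np

img = cv2.imread('test/all_five_objects.jpg')          # example path; the image contains all 5 objects
img = cv2.resize(img, (224, 224)).astype('float32') / 255.0
pred = model.predict(np.expand_dims(img, axis=0))      # shape (1, 5): one probability per class
print(pred)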
Code -
Labels handling -
import sklearn.preprocessing
import keras

# Integer-encode the object names ('A'..'E' -> 0..4), then one-hot encode them
le = sklearn.preprocessing.LabelEncoder()
y = le.fit_transform(labels)
y = keras.utils.np_utils.to_categorical(y, 5)
The variable 'labels' is a NumPy array of object names ('A', 'B', 'C', 'D', 'E').
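For example, the encoding looks like this (LabelEncoder sorts the class names alphabetically; the sample values are illustrative):

print(le.classes_)   # ['A' 'B' 'C' 'D' 'E']
print(y[0])          # e.g. [1. 0. 0. 0. 0.] if the first cropped object is 'A'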
Model -
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.optimizers import Adam

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=(224, 224, 3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(96, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(512, activation="relu"))
model.add(Dropout(0.2))
model.add(Dense(256, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(5, activation="softmax"))
model.compile(
loss='categorical_crossentropy',
optimizer= Adam(),
metrics=['accuracy']
)
model.summary()
history = model.fit(x_train, y_train, epochs=30, verbose=1,
                    validation_data=(x_test, y_test),
                    batch_size=128,
                    shuffle=True)
Output of metrics.classification_report() -
              precision    recall  f1-score   support

           0       1.00      1.00      1.00       246
           1       1.00      1.00      1.00       284
           2       1.00      1.00      1.00       266
           3       1.00      1.00      1.00       284
           4       0.99      1.00      1.00       241

    accuracy                           1.00      1804
   macro avg       1.00      1.00      1.00      1804
weighted avg       1.00      1.00      1.00      1804
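For reference, this is roughly how I generated the report (argmax converts the predicted probabilities and the one-hot labels back to class indices):

from sklearn import metrics
import numpy as np

y_pred = model.predict(x_test)                # predicted class probabilities, shape (n, 5)
y_pred_classes = np.argmax(y_pred, axis=1)    # predicted class indices
y_true_classes = np.argmax(y_test, axis=1)    # true class indices from the one-hot labels
print(metrics.classification_report(y_true_classes, y_pred_classes))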