
I have a dataset of images with bounding-box coordinates for the objects in them, and I chose 5 object classes from the dataset for classification. For training, I cropped every object in every image using the bounding-box coordinates, then did one-hot encoding (object names: ['A', 'B', 'C', 'D', 'E']), a train-test split, and finally training. I got 99% validation accuracy. When I test the model on cropped images that contain only one object per image (no background), it classifies the object perfectly.

But when I test the model on an image that contains all 5 objects, the prediction is not accurate. The model returns a predicted probability for every class, but it always gives a high probability to the first object (object A) only, like this: [[1.00000000e+00 2.64929882e-18 4.15273056e-17 1.11363124e-26 4.15807750e-22]]

I do not understand this: the test image contains all 5 objects, so I expected the model to give a high probability to every object instead of only one (object A). Could you please help me understand this issue? What should I do to get a high probability for all the objects when the input image contains all 5 of them?

Code -

Labels handling -

import sklearn.preprocessing
from keras.utils import np_utils

# Encode the class names as integers, then one-hot encode them.
le = sklearn.preprocessing.LabelEncoder()
y = le.fit_transform(labels)
y = np_utils.to_categorical(y, 5)

The variable 'labels' is a NumPy array of object names ('A', 'B', 'C', 'D', 'E').
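For reference, this is roughly what the two encoding steps produce. A minimal NumPy-only sketch (the sample labels below are made up, and np.unique stands in for LabelEncoder; np.eye indexing stands in for to_categorical):

```python
import numpy as np

# Hypothetical sample labels mirroring the five object names.
labels = np.array(['A', 'B', 'C', 'D', 'E', 'A'])

# LabelEncoder maps each class to an integer by sorted order: A->0 ... E->4.
classes, y_int = np.unique(labels, return_inverse=True)

# to_categorical(y, 5) turns each integer into a one-hot row of length 5.
y_onehot = np.eye(len(classes))[y_int]

print(y_onehot[0])  # row for 'A': [1. 0. 0. 0. 0.]
```

Note that each row has exactly one 1 — this single-label encoding is what pushes the model toward softmax-style "pick one class" behaviour.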

Model -

  model = Sequential()

  model.add(Conv2D(32, (3, 3), padding='same', input_shape=(224, 224, 3), activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))

  model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))

  model.add(Conv2D(96, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))

  model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))

  model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
  model.add(MaxPooling2D(pool_size=(2, 2)))

  model.add(Flatten())
  model.add(Dense(512, activation='relu'))
  model.add(Dropout(0.2))
  model.add(Dense(256, activation='relu'))
  model.add(Dropout(0.5))
  model.add(Dense(5, activation='softmax'))

  model.compile(
      loss='categorical_crossentropy',
      optimizer=Adam(),
      metrics=['accuracy'],
  )

  model.summary()

  history = model.fit(
      x_train, y_train,
      epochs=30,
      verbose=1,
      validation_data=(x_test, y_test),
      batch_size=128,
      shuffle=True,
  )

Output of metrics.classification_report() -

                    precision    recall  f1-score   support

                0       1.00      1.00      1.00       246
                1       1.00      1.00      1.00       284
                2       1.00      1.00      1.00       266
                3       1.00      1.00      1.00       284
                4       0.99      1.00      1.00       241

          accuracy                           1.00      1804
        macro avg       1.00      1.00      1.00      1804
      weighted avg       1.00      1.00      1.00      1804
  • The activation of the output layer is softmax, so you will never get a high probability for all of your classes (the probabilities must sum to 1). – Mitiku Aug 07 '20 at 05:05
  • @Mitiku, thanks for the reply. Could you please suggest what other activation function I should use to identify that all the objects are present in the given input image? –  Aug 07 '20 at 05:26
  • You can convert the task to binary classification and use sigmoid activation. – Mitiku Aug 07 '20 at 05:30
  • @Mitiku, binary classification means classification of every single object against background, correct? But when I test more images, it takes time. Is there any other option? –  Aug 07 '20 at 05:39
  • If you modify the loss function, the training time will be the same. I will post the loss function you should use in this case in an answer. – Mitiku Aug 07 '20 at 05:43
  • @Mitiku, could you please tell me how to do that? Thanks. –  Aug 07 '20 at 06:16
  • Your problem is actually called *multi-label* classification, in which, contrary to simple multi-class classification, a sample can belong to more than one class simultaneously. See the duplicate thread for details. – desertnaut Aug 07 '20 at 09:09
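Following up on the comments above, here is a minimal NumPy sketch (the logits are made up) contrasting the two output activations: softmax forces the five scores to compete for a total of 1, so one class dominates even when several objects are present, while per-class sigmoids score each class independently, which is what multi-label classification needs:

```python
import numpy as np

# Hypothetical raw scores for classes A..E; A, B, C are "present", D, E are not.
logits = np.array([8.0, 7.5, 6.0, -3.0, -5.0])

# Softmax: outputs are forced to sum to 1, so the classes compete.
softmax = np.exp(logits - logits.max())
softmax /= softmax.sum()

# Sigmoid: each class is scored independently in [0, 1];
# several classes can be near 1 at the same time.
sigmoid = 1.0 / (1.0 + np.exp(-logits))

print(np.round(softmax, 3))  # sums to 1; mass concentrated on the top class
print(np.round(sigmoid, 3))  # A, B, C all near 1; D, E near 0
```

In Keras terms, the change suggested in the comments would be (sketch, not the asker's code): replace the head with Dense(5, activation='sigmoid'), compile with loss='binary_crossentropy', and train on multi-hot label vectors, e.g. [1, 1, 1, 0, 0] for an image containing A, B, and C.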

0 Answers