I'm trying to build a custom binary classification model in TensorFlow. Training on the dataset goes fine, but when I evaluate or predict, it goes wrong: every sample in the dataset gets a high score on the 2nd label, and I don't know why. Here's the code:

import os

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf


def getModel():
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(64, 64)),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dropout(0.1),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(2, activation='softmax')
    ])

    model.summary()
    lossFn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    initial_learning_rate = 0.3
    decay_steps = 1.0
    decay_rate = 0.5
    # learning_rate_fn = tf.keras.optimizers.schedules.InverseTimeDecay(initial_learning_rate, decay_steps, decay_rate)

    model.compile(optimizer=tf.keras.optimizers.RMSprop(),
                  loss=lossFn,
                  metrics='accuracy')

    return model


def getDataset(path):
    with np.load(path) as data:
        trainData = data['img']
        # trainData = np.reshape(trainData, (trainData.shape[0], trainData.shape[1], trainData.shape[2]))
        trainLabels = data['label']
    trainSet = tf.data.Dataset.from_tensor_slices((trainData, trainLabels))
    trainSet = trainSet.batch(BATCH_SIZE)

    return trainSet


def getTestSet(path):
    with np.load(path) as data:
        testData = data['img']
        # testData = np.reshape(testData, (testData.shape[0], testData.shape[1], testData.shape[2]))
        testLabels = data['label']
    testSet = tf.data.Dataset.from_tensor_slices((testData, testLabels))
    testSet = testSet.batch(BATCH_SIZE)

    return testSet


def getAcc(history):
    plt.plot(history.history['accuracy'], label='accuracy')
    # plt.plot(history.history['val_accuracy'], label = 'val_accuracy')

    plt.xlabel('Epoch')
    plt.ylabel('acc')
    plt.legend(loc='lower right')
    plt.savefig('./acc.png')
    plt.clf()


def getLoss(history):
    plt.plot(history.history['loss'], label='loss')
    plt.xlabel('Epoch')
    plt.ylabel('loss')
    plt.legend(loc='lower right')
    plt.savefig('./loss.png')


if __name__ == "__main__":
    model = getModel()
    dataset = getDataset('./train.npz')
    checkpoint_path = "training_1/cp.ckpt"
    checkpoint_dir = os.path.dirname(checkpoint_path)
    cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 save_weights_only=True,
                                                 verbose=1)
    history = model.fit(dataset, epochs=5, callbacks=[cp_callback])
    getAcc(history)
    getLoss(history)
    model.save('GRaymodel.h5')
    print('training done!!')
    test = getTestSet('test.npz')
    loss, acc = model.evaluate(test)
    print(model.predict(test))
    print('Restored model, accuracy: {:5.2f}%'.format(100 * acc))

Pythoneer
Ice Wind
  • What is the distribution of your data? Is it a balanced dataset? How many samples fall into categories A and B? I think you overfitted the model on category A. – Amir Feb 15 '23 at 10:56
  • I can see several issues: (1) you said binary, but you use a loss function meant for multi-class; (2) you set `from_logits=True`, but the activation of your last layer is `softmax` rather than `None`. See these answers: [(1)](https://stackoverflow.com/a/67467084/9215780), [(2)](https://stackoverflow.com/a/67851641/9215780) – Innat Feb 15 '23 at 14:22
  • @Innat Thanks. I'm a noob at DP, so there are lots of mistakes. Anyway, thanks. – Ice Wind Feb 16 '23 at 04:56
  • @Amir A and B are split half and half, so the problem is probably in my model. – Ice Wind Feb 16 '23 at 05:03
  • @Innat I tried to fix it today but failed. As you suggested, I used `sigmoid` and changed lossFn to binary_crossentropy, but the problem is still there: all the data is predicted as ones. – Ice Wind Feb 17 '23 at 09:09

1 Answer

The model itself is fine. You just need to remove the softmax activation from the last layer in getModel(). With from_logits=True, the SparseCategoricalCrossentropy() loss expects raw logits and applies softmax internally, so a softmax output layer means softmax is applied twice. Once the activation is removed, model.predict() returns logits rather than probabilities, so use argmax() to get the index of the predicted label.
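To make the mismatch concrete, here is a minimal sketch in plain Python (no TensorFlow) of what happens when softmax is applied twice. The helper names `softmax` and `sparse_ce_from_logits` and the example logits are made up for illustration; they only mimic the arithmetic of a loss configured with from_logits=True and are not Keras code.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_ce_from_logits(outputs, label):
    """Cross-entropy for one sample. Softmax is applied inside the loss,
    which is the behaviour that from_logits=True asks for."""
    return -math.log(softmax(outputs)[label])


# Hypothetical raw outputs of the last layer: very confident in class 0.
logits = [4.0, -4.0]
label = 0

# Matching setup: raw logits go straight into the loss.
loss_ok = sparse_ce_from_logits(logits, label)

# Mismatched setup: the layer already applied softmax,
# and the loss then applies it a second time.
probs = softmax(logits)
loss_double = sparse_ce_from_logits(probs, label)

print(f"loss on logits:        {loss_ok:.4f}")
print(f"loss on probabilities: {loss_double:.4f}")
```

With two classes, the second softmax can never give the correct class more than 1 / (1 + e^-1) ≈ 0.73, so the loss cannot fall below about 0.313 however confident the network is. This shows only one consequence of the mismatch and may not be the whole explanation for the behaviour in the question.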

Use the code below for model prediction (the relevant gist was linked for reference):

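print('\nModel prediction!')
# `path` points to the test .npz file; `model` is the trained model.
# The question's files store the arrays under 'img' and 'label'; adjust the keys to match.
with np.load(path) as data:
    testData = data['x_test']
    # testData = np.reshape(testData, (testData.shape[0], testData.shape[1], testData.shape[2]))
    testLabels = data['y_test']
pred = model.predict(testData)
# argmax returns the index of the highest score, i.e. the predicted class.
print("Predicted Value:", np.argmax(pred[5]))
print("Actual Value:", testLabels[5])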

Output:

Model prediction!
313/313 [==============================] - 1s 2ms/step
Predicted Value: 1
Actual Value: 1
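
The snippet above checks a single sample. Since the question is about every prediction landing on one label, it may help to look at the whole test set. Below is a dependency-free sketch with made-up values for `pred` and `labels`; on real arrays, `np.argmax(pred, axis=1)` would do the same job as the hypothetical `argmax` helper.

```python
from collections import Counter


def argmax(row):
    """Index of the largest entry in one row of model outputs."""
    return max(range(len(row)), key=lambda i: row[i])


# Hypothetical model outputs for four samples; argmax works the same
# way on logits and on probabilities.
pred = [[0.2, 1.3], [2.1, -0.4], [-1.0, 0.5], [0.9, 0.1]]
labels = [1, 0, 1, 1]

pred_labels = [argmax(row) for row in pred]
accuracy = sum(p == t for p, t in zip(pred_labels, labels)) / len(labels)

print("predicted labels:", pred_labels)
print("class counts:", Counter(pred_labels))
print("accuracy:", accuracy)
```

If the class counts show only one label on a balanced test set, the model is still collapsing to a single class, and the input scaling and learning rate are worth checking next.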