
I'm trying to get the 'logits' out of my Keras CNN classifier. I have tried the suggested method here: link.

First, I created two models to check the implementation:

  1. create_CNN_MNIST: a CNN classifier that returns softmax probabilities.
  2. create_CNN_MNIST_logits: a CNN with the same layers as in (1), with one twist in the last layer: the activation is changed to linear so the model returns logits.

Both models were fed the same MNIST train and test data. When I then applied softmax to the logits, I got a different output than from the softmax CNN.

I couldn't find a problem in my code. Could you suggest another method to extract the logits from the model?

The code:

import numpy as np
from tensorflow import keras
from tensorflow.keras import utils
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.optimizers import SGD

def softmax(x):
    """Compute softmax values for each set of scores in x."""
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum(axis=0)

def create_CNN_MNIST_logits():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='linear'))
    # compile model
    opt = SGD(learning_rate=0.01, momentum=0.9)
    
    def my_categorical_crossentropy(y_true, y_pred):
        # note: this is (non-sparse) categorical cross-entropy, computed from the logits
        return keras.losses.categorical_crossentropy(y_true, y_pred, from_logits=True)

    model.compile(optimizer=opt, loss=my_categorical_crossentropy, metrics=['accuracy'])
    return model

def create_CNN_MNIST():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# load data
X_train = np.load('./data/X_train.npy')
X_test = np.load('./data/X_test.npy')
y_train = np.load('./data/y_train.npy')
y_test = np.load('./data/y_test.npy')


# create models
model_softmax = create_CNN_MNIST()
model_logits = create_CNN_MNIST_logits()


pixels = 28
channels = 1
num_labels = 10

# Reshaping to format which CNN expects (batch, height, width, channels)
trainX_cnn = X_train.reshape(X_train.shape[0], pixels, pixels, channels).astype('float32')
testX_cnn = X_test.reshape(X_test.shape[0], pixels, pixels, channels).astype('float32')

# Normalize images from 0-255 to 0-1
trainX_cnn /= 255
testX_cnn /= 255

train_y_cnn = utils.to_categorical(y_train, num_labels)
test_y_cnn = utils.to_categorical(y_test, num_labels)


# train the models
model_logits.fit(trainX_cnn, train_y_cnn, validation_split=0.2, epochs=10,
                          batch_size=32)
model_softmax.fit(trainX_cnn, train_y_cnn, validation_split=0.2, epochs=10,
                          batch_size=32)

At the evaluation stage, I apply softmax to the logits to check whether the result is the same as the regular model's:

# predict
y_pred_softmax = model_softmax.predict(testX_cnn)
y_pred_logits = model_logits.predict(testX_cnn)

# apply softmax to the logits to get the same result as the regular CNN
y_pred_logits_activated = softmax(y_pred_logits)

Now I get different values in y_pred_logits_activated and y_pred_softmax, which leads to different accuracy on the test set.
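For reference, this is a minimal sketch of how I quantify the mismatch (assuming y_test still holds the integer class labels; the tolerance is arbitrary):

# sketch: quantify the mismatch between the two models' outputs
print(np.allclose(y_pred_softmax, y_pred_logits_activated, atol=1e-5))

# accuracy of each model on the test set (y_test holds integer labels)
acc_softmax = (y_pred_softmax.argmax(axis=1) == y_test).mean()
acc_logits = (y_pred_logits_activated.argmax(axis=1) == y_test).mean()
print(acc_softmax, acc_logits)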

1 Answer


Your models are probably being trained differently: make sure to set the seed prior to both fit commands so that they're initialised with the same weights and get the same train/val split. Also, the softmax might be incorrect:

def softmax(x):
    """Compute softmax values for each set of scores in x."""
    e_x = np.exp(x)
    # keepdims=True is needed so the division broadcasts per row
    return e_x / e_x.sum(axis=1, keepdims=True)

This is equivalent to subtracting the max, which is only done for numerical stability (https://stackoverflow.com/a/34969389/10475762); the important fix is that the axis should be 1 if your matrix is of shape [batch, outputs].
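A minimal sketch of both suggestions together (the set_seed helper and the seed value are illustrative, not part of your code):

import random
import numpy as np
import tensorflow as tf

def set_seed(seed=0):
    # illustrative helper: seed Python, NumPy and TensorFlow so both
    # models start from the same weights and see the same shuffling
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)

def softmax(x):
    # softmax over the class axis of a [batch, outputs] matrix;
    # subtracting the row max only adds numerical stability
    e_x = np.exp(x - x.max(axis=1, keepdims=True))
    return e_x / e_x.sum(axis=1, keepdims=True)

set_seed(0)
model_softmax = create_CNN_MNIST()
set_seed(0)
model_logits = create_CNN_MNIST_logits()
# then fit both models with identical arguments, as in your code

With identical initial weights and data ordering, softmax(model_logits.predict(x)) should closely match model_softmax.predict(x), though small floating-point differences can still accumulate over training.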
