
Is there a way to get the loss of the model, with its current weights, without running evaluate or fit on it?

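import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import regularizers

# Theta1, Theta2, X, y and lambd are defined earlier (not shown)
model = keras.Sequential([
    keras.layers.Input(shape=(400,)),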
    keras.layers.Dense(25, activation=tf.nn.sigmoid, kernel_regularizer=regularizers.l2(lambd)),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',  # does the 1-hot encoding for us
              metrics=['accuracy'])
model.set_weights([Theta1.T, np.zeros(25), Theta2.T, np.zeros(10)])
prob = model.predict(X)
pred = np.argmax(prob, axis=1).reshape(-1, 1)
pred_y = pred == y
print(f'Training Set Accuracy: {np.mean(pred_y)*100:.2f}%') 
# How do I get the loss now?

This didn't work for me.

Maverick Meerkat

2 Answers

0

You can, by passing the output of model.predict(x) to an implementation of the loss function. You will also need a function that computes the model's regularization losses, here l1l2_weight_loss(model). Below is an implementation of binary_crossentropy together with the l1, l2, and l1_l2 weight losses from all layers, including recurrent ones; it does not include activity_regularizer losses, which aren't weight losses. You can replace binary_crossentropy with your own function, e.g. an implementation of sparse categorical crossentropy.


WORKING IMPLEMENTATION: (numerically stable version)
import numpy as np
import keras.backend as K

def binary_crossentropy(y_true, y_pred, sample_weight=1):
    y_pred = np.asarray(y_pred)
    if len(y_pred.shape)==1:
        y_pred = np.atleast_2d(y_pred).T
    # clip to [eps, 1-eps] so the logit computed below stays finite
    y_pred = np.clip(y_pred[:, 0], K.epsilon(), 1-K.epsilon())
    y_true,y_pred,sample_weight = force_2d_shape([y_true,y_pred,sample_weight])

    logits = np.log(y_pred) - np.log(1-y_pred) # sigmoid inverse
    neg_abs_logits = -np.abs(logits)
    relu_logits    = (logits > 0)*logits

    loss_vec = relu_logits - logits*y_true + np.log(1 + np.exp(neg_abs_logits))
    return np.mean(sample_weight*loss_vec)

def force_2d_shape(arr_list):
    for arr_idx, arr in enumerate(arr_list):
        if len(np.array(arr).shape) != 2:
            arr_list[arr_idx] = np.atleast_2d(arr).T
    return arr_list

def l1l2_weight_loss(model):
    l1l2_loss = 0
    for layer in model.layers:
        if 'layer' in layer.__dict__ or 'cell' in layer.__dict__:
            l1l2_loss += _l1l2_rnn_loss(layer)
            continue

        if 'kernel_regularizer' in layer.__dict__ or \
           'bias_regularizer'   in layer.__dict__:
            l1l2_lambda_k, l1l2_lambda_b = [0,0], [0,0] # defaults
            if layer.__dict__['kernel_regularizer'] is not None:
                l1l2_lambda_k = list(layer.kernel_regularizer.__dict__.values())
            if layer.__dict__['bias_regularizer']   is not None:
                l1l2_lambda_b = list(layer.bias_regularizer.__dict__.values())

            if any([(_lambda != 0) for _lambda in (l1l2_lambda_k + l1l2_lambda_b)]):
                W = layer.get_weights()

                for idx,_lambda in enumerate(l1l2_lambda_k + l1l2_lambda_b):
                    if _lambda != 0:
                        _pow = 2**(idx % 2) # 1 if idx is even (l1), 2 if odd (l2)
                        l1l2_loss += _lambda*np.sum(np.abs(W[idx//2])**_pow)
    return l1l2_loss

def _l1l2_rnn_loss(layer):
    l1l2_loss = 0
    if 'backward_layer' in layer.__dict__:
        bidirectional = True
        _layer = layer.layer
    else:
        _layer = layer
        bidirectional = False
    ldict = _layer.cell.__dict__

    if 'kernel_regularizer'    in ldict or \
       'recurrent_regularizer' in ldict or \
       'bias_regularizer'      in ldict:
        l1l2_lambda_k, l1l2_lambda_r, l1l2_lambda_b = [0,0], [0,0], [0,0]
        if ldict['kernel_regularizer']    is not None:
            l1l2_lambda_k = list(_layer.kernel_regularizer.__dict__.values())
        if ldict['recurrent_regularizer'] is not None:
            l1l2_lambda_r = list(_layer.recurrent_regularizer.__dict__.values())
        if ldict['bias_regularizer']      is not None:
            l1l2_lambda_b = list(_layer.bias_regularizer.__dict__.values())

        all_lambda = l1l2_lambda_k + l1l2_lambda_r + l1l2_lambda_b
        if any([(_lambda != 0) for _lambda in all_lambda]):
            W = layer.get_weights()
            idx_incr = len(W)//2 # accounts for 'use_bias'

            for idx,_lambda in enumerate(all_lambda):
                if _lambda != 0:
                    _pow = 2**(idx % 2) # 1 if idx is even (l1), 2 if odd (l2)
                    l1l2_loss += _lambda*np.sum(np.abs(W[idx//2])**_pow)
                    if bidirectional:
                        l1l2_loss += _lambda*np.sum(
                                    np.abs(W[idx//2 + idx_incr])**_pow)
    return l1l2_loss  # returned unconditionally, so the caller never adds None
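
The binary_crossentropy above works on logits rather than on probabilities directly. As a quick sanity check, here is a minimal sketch (standard library only; the names naive_bce and stable_bce are illustrative, not Keras functions) suggesting that the logit form max(z, 0) - z*y + log(1 + exp(-|z|)) agrees with the textbook formula for probabilities that are not at the extremes:

```python
import math

def naive_bce(p, y):
    # textbook form: -[y*log(p) + (1-y)*log(1-p)]
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def stable_bce(p, y):
    # logit form: max(z, 0) - z*y + log(1 + exp(-|z|))
    z = math.log(p) - math.log(1 - p)  # sigmoid inverse
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

# (probability, label) pairs chosen arbitrarily for illustration
pairs = [(0.9, 1), (0.9, 0), (0.2, 1), (0.5, 0)]
max_gap = max(abs(naive_bce(p, y) - stable_bce(p, y)) for p, y in pairs)
print(max_gap)  # expected to be at floating-point noise level
```

The logit form never exponentiates a positive number, which is why it is preferred when the logits are large in magnitude.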


TESTING IMPLEMENTATION:
from keras.layers import Input, Dense, LSTM, GRU, Bidirectional
from keras.models import Model
from keras.regularizers import l1, l2, l1_l2
import numpy as np 

ipt   = Input(shape=(1200,16))
x     = LSTM(60, activation='relu', return_sequences=True,
                                                 recurrent_regularizer=l2(1e-3),)(ipt)
x     = Bidirectional(GRU(60, activation='relu', bias_regularizer     =l1(1e-4)))(x)
out   = Dense(1,  activation='sigmoid',          kernel_regularizer   =l1_l2(2e-4))(x)
model = Model(ipt,out)

model.compile(loss='binary_crossentropy', optimizer='adam')
X = np.random.rand(10,1200,16) # (batch_size, timesteps, input_dim)
Y = np.random.randint(0,2,(10,1))
class_weights = {'0':1, '1': 6}
sample_weights = np.array([class_weights[str(label[0])] for label in Y])
keras_loss   = model.evaluate(X,Y,sample_weight=sample_weights)
custom_loss  = binary_crossentropy(Y, model.predict(X), sample_weights)
custom_loss += l1l2_weight_loss(model)

print('%.6f'%keras_loss  + ' -- keras_loss')
print('%.6f'%custom_loss + ' -- custom_loss') 

0.763822 -- keras_loss
0.763822 -- custom_loss
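
For the model in the question (softmax output, sparse_categorical_crossentropy, one l2 kernel regularizer), the same idea could be sketched as follows. This is a plain-Python illustration under stated assumptions, not Keras code: the function names and the eps default (chosen to mirror the default K.epsilon() of 1e-7) are mine, and the l2 term uses lam * sum(w**2), which is how keras.regularizers.l2 defines its penalty.

```python
import math

def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    """Mean of -log(p[true class]); probs has one row of class probabilities per sample."""
    total = 0.0
    for label, row in zip(labels, probs):
        p = min(max(row[label], eps), 1 - eps)  # clip to avoid log(0)
        total += -math.log(p)
    return total / len(labels)

def l2_penalty(kernel, lam):
    """lam * sum of squared weights of a 2-D kernel (bias is not regularized here)."""
    return lam * sum(w * w for row in kernel for w in row)

# toy values standing in for y, model.predict(X) and a layer kernel
labels = [0, 2]
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.3, 0.6]]
kernel = [[1.0, -2.0],
          [0.5, 0.0]]

data_loss = sparse_categorical_crossentropy(labels, probs)
reg_loss = l2_penalty(kernel, lam=0.01)
print(data_loss + reg_loss)
```

With the question's arrays, labels would be y.ravel(), probs the output of model.predict(X), and kernel the first element of model.layers[0].get_weights(), assuming the regularized Dense layer is the first entry in model.layers.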

OverLordGoldDragon
  • This is nice, but I was hoping there's a built-in function in Keras... I mean, there's a function there somewhere that calculates this loss, so don't we have access to it? – Maverick Meerkat Sep 17 '19 at 19:17
  • @DavidRefaeli If you dig deep enough into tensorflow docs and figure out all the necessary typecasting, yes, probably - but for reasons unknown to me, the docs are so convoluted that you'd sooner code up your own solution than find one there. If your goal's to retain `predict` outputs while sparing additional calls for performance, the only alternative I can suggest is to add a callback to `evaluate` to return predictions during runtime - which I've attempted, but didn't quite work out (don't recall why) - feel free to try, as [here](https://stackoverflow.com/q/47079111/10133797) – OverLordGoldDragon Sep 17 '19 at 21:05
  • @DavidRefaeli To clarify, I'm fairly sure there _isn't_ a single, callable built-in method for computing loss. However, I _did_ just discover some Keras methods which'd accomplish this in a few calls, if not one - have a look at [evaluate](https://github.com/keras-team/keras/blob/master/keras/engine/training.py#L1241) and [test_loop](https://github.com/keras-team/keras/blob/master/keras/engine/training_arrays.py#L342) - it may involve the use of `K.function` – OverLordGoldDragon Sep 17 '19 at 22:57
0

> This is nice, but I was hoping there's a built-in function in Keras... I mean, there's a function there somewhere that calculates this loss, so don't we have access to it?

Yes, you have access to the model's compiled loss and metrics, and can call them on the scores computed by a forward pass:

y_pred = model(X, training=False)
model.compiled_loss(y_test, y_pred)
model.compiled_metrics.update_state(y_test, y_pred)

Then, for each of the compiled metrics in your model, you can get the name and value as a dict (this includes the average loss):

{m.name: m.result().numpy() for m in model.metrics}

Note 1: You have to wrap the scores (y_pred) and labels (y_test) in tf.constant if not in eager mode.

Note 2: To avoid mixing the compiled loss and metric states between training and validation, you must reset the states before switching between the training and validation loops:

for m in model.metrics:
    m.reset_states()

For this reason, I prefer to have a separate loss tracker and metric object:

val_loss = keras.metrics.Mean(name="loss")
val_acc = keras.metrics.CategoricalAccuracy()

y_pred = model(X, training=False)
loss = model.compiled_loss(y_test, y_pred)

val_loss.update_state(loss)
val_acc.update_state(y_test, y_pred)
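
To see why the state reset in Note 2 matters, here is a small plain-Python stand-in for a stateful mean tracker. The RunningMean class is hypothetical and only mimics the update_state / result / reset_states pattern; it is not Keras's implementation, and the batch losses are made-up numbers.

```python
class RunningMean:
    """Hypothetical stand-in for a stateful mean tracker (not Keras code)."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update_state(self, value):
        self.total += value
        self.count += 1

    def result(self):
        return self.total / self.count if self.count else 0.0

    def reset_states(self):
        self.total = 0.0
        self.count = 0

train_losses = [0.9, 0.7]  # made-up per-batch training losses
val_losses = [0.3, 0.5]    # made-up per-batch validation losses

# Shared tracker without a reset: validation mean is polluted by training batches
shared = RunningMean()
for v in train_losses + val_losses:
    shared.update_state(v)
mixed_mean = shared.result()

# Same tracker, reset before validation: only validation batches are averaged
shared.reset_states()
for v in val_losses:
    shared.update_state(v)
clean_mean = shared.result()

print(mixed_mean, clean_mean)
```

A dedicated validation tracker, as in the snippet above, avoids the problem without having to remember the reset.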

Source:

https://keras.io/guides/customizing_what_happens_in_fit/

fabda01