
I am trying to classify Credit Card Fraud with a Keras model. Because the dataset is imbalanced, I need to use the F1 score so that recall is taken into account, not just accuracy.

Apparently, Keras is not accepting my F1 function definition. How can I monitor my custom metric in each epoch? Early stopping works fine with val_loss, but not with the metric I defined. I receive this message:

Train on 139554 samples, validate on 59810 samples
Epoch 1/10

7s - loss: 0.3585 - acc: 0.9887 - val_loss: 0.0560 - val_acc: 0.9989
/home/libardo/anaconda3/lib/python3.6/site-packages/keras/callbacks.py:526: RuntimeWarning: Early stopping conditioned on metric f1s which is not available. Available metrics are: val_loss,val_acc,loss,acc
(self.monitor, ','.join(list(logs.keys()))), RuntimeWarning

EarlyStopping is ignoring my custom metrics defined #10018

Remark: It was not possible for me to paste my code here. I apologize for that.
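For reference, here is a minimal sketch of the kind of setup that reproduces the warning. The dummy data, layer sizes, and the metric name f1s are placeholders rather than my actual script; the point is that the F1 function is defined in the script but never passed to model.compile, so Keras never logs it and EarlyStopping cannot find it:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping

# Placeholder data standing in for the credit card features.
X_train, y_train = np.random.rand(1000, 30), np.random.randint(0, 2, 1000)
X_val, y_val = np.random.rand(200, 30), np.random.randint(0, 2, 200)

model = Sequential([Dense(16, activation='relu', input_shape=(30,)),
                    Dense(1, activation='sigmoid')])

# The custom F1 function is not listed under metrics=[...], so the only
# logged keys are loss, acc, val_loss and val_acc.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# EarlyStopping then looks for a key named 'f1s' that is never logged.
early_stopping = EarlyStopping(monitor='f1s', patience=2)
model.fit(X_train, y_train, validation_data=(X_val, y_val),
          epochs=10, callbacks=[early_stopping], verbose=2)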


1 Answer


I realize this was posted a long time ago, but I found this question while searching for the same answer and eventually figured it out myself. In short, you need to both name the metric in the EarlyStopping callback and register the function as a metric when compiling the model.

OK, so you've defined your custom loss function or metric with something like this (taken from https://github.com/keras-team/keras/issues/10018 which itself was taken from https://stackoverflow.com/a/45305384/5210098):

# https://stackoverflow.com/a/45305384/5210098
from keras import backend as K

def f1_metric(y_true, y_pred):

    def recall(y_true, y_pred):
        # True positives over all actual positives, on rounded predictions.
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
        recall = (true_positives + K.epsilon()) / (possible_positives + K.epsilon())
        return recall

    def precision(y_true, y_pred):
        # True positives over all predicted positives, on rounded predictions.
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
        precision = (true_positives + K.epsilon()) / (predicted_positives + K.epsilon())
        return precision

    precision = precision(y_true, y_pred)
    recall = recall(y_true, y_pred)
    return 2 * ((precision * recall) / (precision + recall + K.epsilon()))
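As a quick sanity check that the function behaves sensibly, you can evaluate it on a couple of constant tensors (the numbers below are made up; this assumes the standalone Keras backend imported as K above):

y_true = K.constant([[1.0], [0.0], [1.0], [1.0]])
y_pred = K.constant([[0.9], [0.2], [0.4], [0.8]])
# Predictions are rounded inside the metric: 2 true positives, 0 false positives,
# 1 false negative -> precision 1.0, recall 2/3, so F1 is about 0.8.
print(K.eval(f1_metric(y_true, y_pred)))  # ~0.8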

Now, to use this with your EarlyStopping callback, provide the metric's name as a string, e.g. EarlyStopping(monitor='f1_metric'), or, to monitor it on the validation set, use EarlyStopping(monitor='val_f1_metric') instead.

But that's not enough! If you stop there, you'll get the error you got. You also need to supply the actual function as an argument when you compile your model, using model.compile(metrics=[f1_metric]). Note the lack of quotation marks: you are passing the function itself, not a string.

If you compile the model by including the function using the metrics keyword and also include the EarlyStopping Callback, then it should work cleanly.
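Putting it together, a minimal sketch might look like the following. The model architecture and the dummy data are placeholders; the parts that matter are metrics=[f1_metric] in compile() and monitor='val_f1_metric' in the callback:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping

# Placeholder data; substitute your real training and validation sets.
X_train, y_train = np.random.rand(1000, 30), np.random.randint(0, 2, 1000)
X_val, y_val = np.random.rand(200, 30), np.random.randint(0, 2, 200)

model = Sequential([Dense(16, activation='relu', input_shape=(30,)),
                    Dense(1, activation='sigmoid')])

# Pass the function itself (no quotes); this is what makes 'f1_metric' and
# 'val_f1_metric' appear in the logs dictionary that callbacks read.
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy', f1_metric])

# Monitor the validation version by name. mode='max' matters here: F1 should be
# maximized, and EarlyStopping's 'auto' mode would otherwise fall back to
# minimizing a metric name it doesn't recognize.
early_stopping = EarlyStopping(monitor='val_f1_metric', mode='max', patience=3)

model.fit(X_train, y_train, validation_data=(X_val, y_val),
          epochs=10, callbacks=[early_stopping], verbose=2)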

Matt
  • Hi Matt, wouldn't it make more sense to add epsilon to the numerator of recall and precision too? Imagine a batch with no relevant documents: a perfect system would predict no documents (the right answer IMO), but it's being punished with a recall of 0/ε=0 instead of a more appropriate score of ε/ε=1. The same holds for precision: no predicted documents = no spurious documents = perfect precision. – lenz Mar 15 '19 at 15:38
  • Ah! Yes it would. I just took the code from the original post but this definitely is an improvement. When there are 0 predictions and also 0 cases, it should be ε/ε=1 as you suggest. I've since edited my answer. – Matt Apr 02 '19 at 13:03
  • Oh hey, in the meantime I figured out why it's usually done this way. The problem with F-score is that you get different results if you calculate it over the whole dataset or average over F-scores from each batch (micro vs. macro average, but batches aren't a meaningful unit). With 0/ε you get an underestimation, while ε/ε tends to overestimate the globally calculated F-score. Either way, it's a rough approximation (a small numeric sketch below these comments illustrates the difference). – lenz Apr 02 '19 at 15:24
  • This didn't solve mine. The model compiles with the custom metric as defined above, and you can see it while it trains, but as soon as you hit epoch (patience + 1), EarlyStopping crashes. Any other thoughts as to what it might be? – Mike_K Nov 14 '19 at 03:13
  • @Mike_K I'd probably have to see the error and the relevant code to know for sure, unfortunately. My hunch is that it's trying to stop training at that point and it's causing some failure. If that's the cause, there may be multiple problems: 1. it seems to never improve in the `patience` epochs (shouldn't it have improved by then?) and 2. it can't stop correctly. – Matt Nov 15 '19 at 18:12
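To make the batch-versus-global point from lenz's comment concrete, here is a small numeric sketch with made-up labels and predictions; scikit-learn is only used to compute the reference F1 values:

import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
y_pred = (rng.random(1000) > 0.6).astype(int)

# F1 computed once over the whole dataset.
global_f1 = f1_score(y_true, y_pred)

# Mean of per-batch F1 scores, which is roughly what an epoch-averaged
# batch-wise Keras metric reports.
batch_f1 = np.mean([f1_score(y_true[i:i + 32], y_pred[i:i + 32], zero_division=1)
                    for i in range(0, 1000, 32)])

print(global_f1, batch_f1)  # the two values generally differ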