
I get this error:

sum() got an unexpected keyword argument 'out'

when I run this code:

import pandas as pd, numpy as np
import keras
from keras.layers.core import Dense, Activation
from keras.models import Sequential

def AUC(y_true,y_pred):
    not_y_pred=np.logical_not(y_pred)
    y_int1=y_true*y_pred
    y_int0=np.logical_not(y_true)*not_y_pred
    TP=np.sum(y_pred*y_int1)
    FP=np.sum(y_pred)-TP
    TN=np.sum(not_y_pred*y_int0)
    FN=np.sum(not_y_pred)-TN
    TPR=np.float(TP)/(TP+FN)
    FPR=np.float(FP)/(FP+TN)
    return((1+TPR-FPR)/2)

# Input datasets

train_df = pd.DataFrame(np.random.rand(91,1000))
train_df.iloc[:,-2]=(train_df.iloc[:,-2]>0.8)*1


model = Sequential()
model.add(Dense(output_dim=60, input_dim=91, init="glorot_uniform"))
model.add(Activation("sigmoid"))
model.add(Dense(output_dim=1, input_dim=60, init="glorot_uniform"))
model.add(Activation("sigmoid"))

model.compile(optimizer='rmsprop',loss='binary_crossentropy',metrics=[AUC])


train_df.iloc[:,-1]=np.ones(train_df.shape[0]) #bias
X=train_df.iloc[:,:-1].values
Y=train_df.iloc[:,-1].values
print X.shape,Y.shape

model.fit(X, Y, batch_size=50,show_accuracy = False, verbose = 1)

Is it possible to implement a custom metric aside from doing a loop on batches and editing the source code?


3 Answers


Here I'm answering the question in the title rather than the OP's exact problem, since this question is the top Google result for the topic.

You can implement a custom metric in two ways.

  1. As described in the Keras documentation.

    import keras.backend as K
    
    def mean_pred(y_true, y_pred):
        return K.mean(y_pred)
    
    model.compile(optimizer='sgd',
              loss='binary_crossentropy',
              metrics=['accuracy', mean_pred])
    

    But here you have to remember, as Marcin Możejko's answer points out, that y_true and y_pred are tensors. So to calculate the metric correctly you need to use keras.backend functionality. For details, see this SO question: How to calculate F1 Macro in Keras?

  2. Or you can implement it in a hacky way, as mentioned in a Keras GitHub issue, by using the callbacks argument of model.fit.

    import keras as keras
    import numpy as np
    from keras.optimizers import SGD
    from sklearn.metrics import roc_auc_score
    
    model = keras.models.Sequential()
    # ...
    sgd = SGD(lr=0.001, momentum=0.9)
    model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['accuracy'])
    
    
    class Metrics(keras.callbacks.Callback):
        def on_train_begin(self, logs={}):
            self._data = []
    
        def on_epoch_end(self, epoch, logs={}):
            X_val, y_val = self.validation_data[0], self.validation_data[1]
            y_predict = np.asarray(model.predict(X_val))
    
            y_val = np.argmax(y_val, axis=1)
            y_predict = np.argmax(y_predict, axis=1)
    
            self._data.append({
                'val_rocauc': roc_auc_score(y_val, y_predict),
            })
            return
    
        def get_data(self):
            return self._data
    
    metrics = Metrics()
    history = model.fit(X_train, y_train, epochs=100, validation_data=(X_val, y_val), callbacks=[metrics])
    metrics.get_data()
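To make option 1 concrete: the OP's balanced-accuracy-style formula can be written entirely with operations that have keras.backend counterparts (K.round, K.sum, K.epsilon). The sketch below uses the numpy versions of the same ops so it runs on plain arrays; swapping np for keras.backend's K (and eps for K.epsilon()) gives the tensor version. The name balanced_accuracy is mine, not from the question.

```python
import numpy as np

def balanced_accuracy(y_true, y_pred, eps=1e-7):
    # Every op here (round, sum, elementwise * and -) has a direct
    # keras.backend counterpart, so the same body works on tensors
    # if np is replaced by K and eps by K.epsilon().
    y_pred = np.round(y_pred)                 # threshold probabilities at 0.5
    tp = np.sum(y_true * y_pred)              # true positives
    fp = np.sum((1 - y_true) * y_pred)        # false positives
    tn = np.sum((1 - y_true) * (1 - y_pred))  # true negatives
    fn = np.sum(y_true * (1 - y_pred))        # false negatives
    tpr = tp / (tp + fn + eps)
    fpr = fp / (fp + tn + eps)
    return (1 + tpr - fpr) / 2

y_true = np.array([0., 0., 1., 1.])
y_pred = np.array([0.1, 0.6, 0.4, 0.9])
print(balanced_accuracy(y_true, y_pred))  # 0.5: one each of TP/FP/TN/FN here
```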
    
vogdb
  • I suggest using self.model rather than model, so that this class can be stashed away in a different file. – Dan Erez Oct 29 '18 at 10:57
  • For people working with a large validation dataset: you will face twice the validation time, one pass done by Keras and one done by your metric calling predict. Another issue: the metric uses the GPU for predict and the CPU (numpy) to compute the metric, so GPU and CPU run serially. If the metric is compute-expensive you will get worse GPU utilization and will have to redo optimizations that Keras already does. – saurabheights Aug 01 '19 at 01:39
  • @saurabheights Any workarounds for this other than using callbacks? – Likith Reddy Dec 19 '20 at 11:37
  • I don't think validation_data is passed to the `on_epoch_end` method when I use this code; it says `None type not subscriptable`. – Likith Reddy Dec 19 '20 at 14:18
  • @LikithReddy Sorry, but I don't even remember my comment above well. I used Keras a lot at my last company but haven't used it in over a year, and I no longer have a Keras setup locally, so it would take a few hours to dig in and see what you are asking. At the moment I won't be able to help. If I recall, what I did was a second GPU predict call; validation is quite fast, so the GPU wasn't the problem. I moved the CPU computation code to another process (using futures, queues, ...). – saurabheights Dec 20 '20 at 08:28
  • This way the serial code would not block the GPU code; I got quite high speed improvements. Plus, during validation you can use a higher batch size (no GPU memory used in validation/inference), so validation runs quite fast. – saurabheights Dec 20 '20 at 08:31
  • This technique works well. Hint: use `y_predict = np.asarray(model.predict(X_val, batch_size=32768))` to drop prediction time; in my case it went from 14 seconds to 0.40 seconds. – Contango Jan 08 '21 at 23:06
  • FYI, I am getting `AttributeError: ... object has no attribute 'validation_data'` when I go for the option with `Callback`. – Prefect May 03 '21 at 15:04

The problem is that y_pred and y_true are not NumPy arrays but either Theano or TensorFlow tensors. That's why you got this error.

You can define your custom metrics, but you have to remember that their arguments are those tensors, not NumPy arrays.
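To see that the error is purely about tensors, here is the OP's function run on plain numpy arrays, unchanged except that np.float (removed in NumPy 1.24) is replaced by the builtin float. It evaluates without error on arrays; the `sum() got an unexpected keyword argument 'out'` failure only appears when Keras hands it symbolic tensors, because np.sum then tries to treat the tensor like an ndarray.

```python
import numpy as np

def AUC(y_true, y_pred):
    # OP's metric, with np.float -> float only
    not_y_pred = np.logical_not(y_pred)
    y_int1 = y_true * y_pred
    y_int0 = np.logical_not(y_true) * not_y_pred
    TP = np.sum(y_pred * y_int1)
    FP = np.sum(y_pred) - TP
    TN = np.sum(not_y_pred * y_int0)
    FN = np.sum(not_y_pred) - TN
    TPR = float(TP) / (TP + FN)
    FPR = float(FP) / (FP + TN)
    return (1 + TPR - FPR) / 2

# On plain numpy arrays this runs fine:
y_true = np.array([1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 0])
print(AUC(y_true, y_pred))  # 0.75 (one TP, one FN, no FP)
```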

Marcin Możejko
  • Could you maybe elaborate on how to solve this? How can I transform a tensor to a numpy array? – ste Oct 13 '16 at 13:58
  • You need to think of a tensor as an algebraic variable. You cannot transform a numpy array into a tensor; you can only assign the numpy array as the value of a tensor. – Marcin Możejko Oct 13 '16 at 14:30
  • I don't want to transform a numpy array into a tensor but the other way around. If we take the OP's function, I tried `def AUC(y_true, y_pred): numpy_y_true = y_true.eval() ... return ...` but it didn't work. How would you solve the OP's problem? – ste Oct 31 '16 at 12:28
  • A custom metric can't be a typical number-returning function. Even `def auc(y_true, y_pred): return 1.0` won't work: the metric function gets called once at compile time, before any training examples are seen. – Dustin Boswell Sep 22 '17 at 16:48
  • That's true. But the inputs still need to be tensors. – Marcin Możejko Sep 22 '17 at 16:50

You can call model.predict() inside your AUC metric function. (This will iterate over batches, so you might be better off using model.predict_on_batch().) Assuming you have something like a softmax layer as output (something that outputs probabilities), you can use that together with sklearn.metrics to get the AUC.

from sklearn.metrics import roc_curve, auc

from here

def sklearnAUC(test_labels,test_prediction):
    n_classes = 2
    # Compute ROC curve and ROC area for each class
    fpr = dict()
    tpr = dict()
    roc_auc = dict()
    for i in range(n_classes):
        # ( actual labels, predicted probabilities )
        fpr[i], tpr[i], _ = roc_curve(test_labels[:, i], test_prediction[:, i])
        roc_auc[i] = auc(fpr[i], tpr[i])

    return round(roc_auc[0],3) , round(roc_auc[1],3)

Now compute the metric:

# gives a numpy array like [ [0.3,0.7] , [0.2,0.8] ... ]
Y_pred = model.predict_on_batch(X_test)
# Y_test looks something like [ [0,1] , [1,0] ... ]
# auc1 and auc2 should be equal
auc1, auc2 = sklearnAUC(Y_test, Y_pred)
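If you want to sanity-check the sklearn result without depending on it, binary ROC AUC is equivalently the probability that a randomly chosen positive is scored above a randomly chosen negative (the Mann-Whitney statistic). A small numpy cross-check, using the example label/score vectors from sklearn's roc_curve docs; the function name is mine:

```python
import numpy as np

def roc_auc_numpy(labels, scores):
    # Binary ROC AUC via the Mann-Whitney statistic: the fraction of
    # (positive, negative) pairs where the positive scores higher,
    # counting ties as half.
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc_numpy(y_true, y_score))  # 0.75, matching roc_auc_score
```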
ahmedhosny