
I am using a deep neural network model (implemented in Keras) to make predictions. Something like this:

import tensorflow as tf
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Lambda, SimpleRNN

def make_model():
    model = Sequential()
    model.add(Conv2D(20, (5, 5), activation="relu", input_shape=x_train.shape[1:]))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(20, activation="relu"))
    # add a time dimension so the Dense output can feed the RNN
    model.add(Lambda(lambda x: tf.expand_dims(x, axis=1)))
    model.add(SimpleRNN(50, activation="relu"))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(loss="binary_crossentropy", optimizer="adagrad", metrics=["accuracy"])

    return model

model = make_model()
model.fit(x_train, y_train, validation_data=(x_validation, y_validation), epochs=25, batch_size=25, verbose=1)

# Prediction:
prediction = model.predict_classes(x)
probabilities = model.predict_proba(x)  # I assume these are the probabilities of the predicted class

My problem is a binary classification problem. I wish to calculate a confidence score for each of these predictions, i.e. I wish to know whether my model is 99% certain it is "0" or only 58% certain it is "0".

I have found some views on how to do it, but can't implement them. The approach I wish to follow says: "With classifiers, when you output you can interpret values as the probability of belonging to each specific class. You can use their distribution as a rough measure of how confident you are that an observation belongs to that class."

How should I predict with a model like the one above so that I get its confidence about each prediction? I would appreciate some practical examples (preferably in Keras).

yamini goel

3 Answers


The softmax output is a problematic way to estimate a model's confidence in its predictions.

There are a few recent papers about this topic.

You can look for "calibration" of neural networks in order to find the relevant papers.

This is one example you can start with: Guo et al., "On Calibration of Modern Neural Networks" - https://arxiv.org/pdf/1706.04599.pdf
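
That paper's main method is temperature scaling: divide the network's logits by a scalar T fitted on a validation set, then apply the sigmoid/softmax. Below is a minimal sketch for the binary sigmoid case in this question, assuming hypothetical arrays val_logits and val_labels holding pre-sigmoid outputs and 0/1 labels for a held-out set (to get logits from a sigmoid model you can invert the activation with log(p / (1 - p))):

import numpy as np
from scipy.optimize import minimize_scalar

def nll(T, logits, labels):
    # negative log-likelihood of sigmoid(logits / T) against binary labels
    p = 1.0 / (1.0 + np.exp(-logits / T))
    eps = 1e-12  # guard against log(0)
    return -np.mean(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))

# val_logits / val_labels are hypothetical: held-out logits and 0/1 labels
res = minimize_scalar(lambda T: nll(T, val_logits, val_labels),
                      bounds=(0.05, 10.0), method="bounded")
T_opt = res.x

# calibrated confidence for new predictions
calibrated_p = 1.0 / (1.0 + np.exp(-test_logits / T_opt))

Note that T > 1 softens overconfident probabilities toward 0.5 while leaving the thresholded class prediction itself unchanged.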

theletz

In Keras, there is a method called predict() that is available for both Sequential and Functional models. Since you are using binary_crossentropy with a final one-unit Dense layer and a sigmoid activation, mymodel.predict() will return one probability per instance: the probability of class 1 (the probability of class 0 is simply one minus that value).

These values are the confidence scores that you mentioned. Here is how to call it with one test data instance; you can use np.where() as shown below to threshold the probability at 0.5 and obtain the final class.

import numpy as np

yhat_probabilities = mymodel.predict(mytestdata, batch_size=1)  # shape (1, 1): P(class 1)
yhat_classes = np.where(yhat_probabilities > 0.5, 1, 0).squeeze().item()
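
If you also want a single "confidence" number regardless of which class was predicted, one common convention is to report the probability of the predicted class. A minimal sketch, continuing from the snippet above (variable names here are mine):

p = float(yhat_probabilities.squeeze())           # probability of class 1
confidence = p if yhat_classes == 1 else 1.0 - p  # probability of the predicted class
print(f"predicted {yhat_classes} with confidence {confidence:.1%}")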

I've come to understand that the probabilities that are output by logistic regression can be interpreted as confidence.

Here are some links to help you come to your own conclusion.

https://machinelearningmastery.com/how-to-score-probability-predictions-in-python/

how to assess the confidence score of a prediction with scikit-learn

https://stats.stackexchange.com/questions/34823/can-logistic-regressions-predicted-probability-be-interpreted-as-the-confidence

https://kiwidamien.github.io/are-you-sure-thats-a-probability.html

Feel free to upvote my answer if you find it useful.

stackoverflowuser2010
    Even I was thinking of using 'softmax' and am currently using `predict` to get these probabilities. But this post (mlg.eng.cam.ac.uk/yarin/blog_3d801aa532c1ce.html) seems to disagree with your assumption that these values are the confidence score: "In this model we feed our prediction into a softmax which gives us probabilities for the different classes. Interestingly enough, these probabilities are not enough to see if our model is certain in its prediction or not. This is because the standard model would pass the predictive mean through the softmax rather than the entire distribution." – yamini goel Jan 22 '20 at 04:42
    I was initially doing exactly what you suggest, but my only concern is: is this approach even valid for a NN? Could you please cite some source suggesting this technique for NNs. – yamini goel Jan 22 '20 at 05:00

How about using a softmax as the activation in the last layer? Let's say something like this:

# note: categorical_crossentropy expects one-hot encoded labels
model.add(Dense(2, activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adagrad", metrics=["accuracy"])

In this way, for each data point the model gives you a probabilistic-ish result that tells you how likely it is that the point belongs to each of the two classes.

For example, for a given X, if the model returns (0.3, 0.7), you know it is more likely that X belongs to class 1 than to class 0, and that the likelihood has been estimated as 0.7 versus 0.3.
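
Here is a minimal sketch of how reading off the predicted class and its probability could look, assuming the labels are one-hot encoded with to_categorical (variable names are illustrative):

import numpy as np
from keras.utils import to_categorical

# categorical_crossentropy needs one-hot labels
y_train_onehot = to_categorical(y_train, num_classes=2)

probs = model.predict(x)                 # shape (n_samples, 2), each row sums to 1
pred_class = np.argmax(probs, axis=1)    # class with the larger probability
pred_conf = np.max(probs, axis=1)        # that probability, as a rough confidence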

alift
    Thank you for the answer. Even I was thinking of using 'softmax', however this post (http://mlg.eng.cam.ac.uk/yarin/blog_3d801aa532c1ce.html) seems to disagree with you: "In this model we feed our prediction into a softmax which gives us probabilities for the different classes (the 10 digits). Interestingly enough, these probabilities are not enough to see if our model is certain in its prediction or not. This is because the standard model would pass the predictive mean through the softmax rather than the entire distribution." – yamini goel Jan 22 '20 at 03:37