43

I would like to calculate the NN model's certainty/confidence (see What my deep model doesn't know) - when the NN tells me an image represents an "8", I would like to know how certain it is. Is my model 99% certain it is an "8", or is it 51% an "8" while it could also be a "6"? Some digits are quite ambiguous and I would like to know for which images the model is just "flipping a coin".

I have found some theoretical writings about this, but I have trouble putting them into code. If I understand correctly, I should evaluate a test image multiple times while "killing off" different neurons (using dropout) and then...?

Working on the MNIST dataset, I am running the following model:

from keras.models import Sequential
from keras.layers import Dense, Activation, Conv2D, Flatten, Dropout

model = Sequential()
model.add(Conv2D(128, kernel_size=(7, 7),
                 activation='relu',
                 input_shape=(28, 28, 1,)))
model.add(Dropout(0.20))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Dropout(0.20))
model.add(Flatten())
model.add(Dense(units=64, activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(units=10, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])
model.fit(train_data, train_labels,  batch_size=100, epochs=30, validation_data=(test_data, test_labels,))

How should I predict with this model so that I get its certainty about predictions too? I would appreciate some practical examples (preferably in Keras, but any will do).

To clarify, I am looking for an example of how to get certainty using the method outlined by Yarin Gal (or an explanation of why some other method yields better results).

johndodo
  • You can use Monte Carlo Dropout methodology to compute prediction uncertainties (https://stackoverflow.com/a/71750927/10375049). Here are two useful applications in classification (https://towardsdatascience.com/when-your-neural-net-doesnt-know-a-bayesian-approach-with-keras-4782c0818624) and regression (https://towardsdatascience.com/extreme-event-forecasting-with-lstm-autoencoders-297492485037) contexts – Marco Cerliani Apr 11 '22 at 06:46

4 Answers

34

If you want to implement the dropout approach to measure uncertainty, you should do the following:

  1. Implement a function which applies dropout at test time as well:

    import keras.backend as K
    # Passing K.learning_phase() as an extra input lets us switch dropout on at prediction time
    f = K.function([model.layers[0].input, K.learning_phase()],
                   [model.layers[-1].output])
    
  2. Use this function as an uncertainty predictor, e.g. in the following manner:

    import numpy as np

    def predict_with_uncertainty(f, x, n_iter=10):
        # One set of softmax outputs per stochastic forward pass;
        # model.output_shape[-1] is the number of classes (10 for MNIST)
        result = np.zeros((n_iter, x.shape[0], model.output_shape[-1]))

        for i in range(n_iter):
            # f takes [inputs, learning_phase]; learning_phase=1 keeps dropout active
            result[i] = f([x, 1])[0]

        prediction = result.mean(axis=0)    # averaged softmax probabilities
        uncertainty = result.var(axis=0)    # per-class variance across passes
        return prediction, uncertainty
    

Of course, you may use a different function to compute the uncertainty.
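For example, with the MNIST model from the question, the function above could be used roughly like this to find the "coin flip" images (a minimal sketch; the batch size, n_iter=50 and the 0.6 threshold are arbitrary choices, not part of the original answer):

prediction, uncertainty = predict_with_uncertainty(f, test_data[:100], n_iter=50)

best = prediction.argmax(axis=1)                    # most likely digit per image
confidence = prediction.max(axis=1)                 # its averaged softmax probability
spread = uncertainty[np.arange(len(best)), best]    # variance of that probability across passes

# Images where the averaged prediction is far from certain
ambiguous = np.where(confidence < 0.6)[0]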

Marcin Możejko
  • This looks like exactly what I was looking for! Unfortunately the bounty expired while I was away, so I'll start and award another one. Thank you! (EDIT: but of course, I can only award double, and only after 24 hours... so till tomorrow it is... :) ) – johndodo May 01 '17 at 18:00
  • What does n_iter represent in your function, @Marcin Możejko? – Vincent Pakson Apr 23 '18 at 05:59
  • When you say uncertainty, when it says 0.93, does it imply that it is 93% uncertain or is it 93% certain that it is the choice? – Vincent Pakson Apr 23 '18 at 08:11
  • Having issues with this function. What TensorFlow/Keras version was this written with? – hisairnessag3 May 25 '18 at 03:48
  • I don't understand this.. won't the model predict the exact same value for the same input each time? In that case, the var (variance) will be 0 each time.. When I implemented this for my code.. I got the exact same value predicted all 10 times (n_iter=10) – sand Apr 10 '19 at 13:27
  • One question about this answer: wouldn't that way of implementing Bayesian dropout also make other layers (like BN) behave in training mode? – jdeJuan May 05 '19 at 16:34
6

Made a few changes to the top voted answer. Now it works for me.

It's a way to estimate model uncertainty. For other sources of uncertainty, I found https://eng.uber.com/neural-networks-uncertainty-estimation/ helpful.

import keras.backend as K
import numpy as np

f = K.function([model.layers[0].input, K.learning_phase()],
               [model.layers[-1].output])


def predict_with_uncertainty(f, x, n_iter=10):
    result = []

    for i in range(n_iter):
        # learning_phase=1 keeps dropout active; K.function returns a list of outputs
        result.append(f([x, 1])[0])

    result = np.array(result)           # shape: (n_iter, batch_size, 10)

    prediction = result.mean(axis=0)
    uncertainty = result.var(axis=0)
    return prediction, uncertainty
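If you want a single uncertainty number per image rather than a 10-element variance vector, one common summary in the MC-dropout literature is the predictive entropy of the averaged softmax (a minimal sketch; predictive_entropy is my name for it, not part of the original answer):

def predictive_entropy(prediction, eps=1e-12):
    # Higher entropy means the averaged softmax is closer to uniform,
    # i.e. the model is closer to "flipping a coin" between digits.
    return -np.sum(prediction * np.log(prediction + eps), axis=-1)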
Chexn
  • I have tried using your implementation here and while it seems to function, I only seem to receive a matrix of the same predictions, and a matrix of uncertainties containing only 0's? Any help gratefully received! Thanks! – cmp Dec 04 '19 at 21:30
  • You are getting 0s because dropout is disabled during inference. Only if it stays enabled at prediction time will you get different results. You can pass training=True when calling the dropout layer. You can find the detailed article here - https://towardsdatascience.com/is-your-algorithm-confident-enough-1b20dfe2db08 – Malgo Oct 27 '21 at 20:14
3

Your model uses a softmax activation, so the simplest way to obtain some kind of uncertainty measure is to look at the output softmax probabilities:

probs = model.predict(some_input_data)[0]

The probs array will then be a 10-element vector of numbers in the [0, 1] range that sum to 1.0, so they can be interpreted as probabilities. For example, the probability for digit 7 is just probs[7].

With this information you can do some post-processing: typically the predicted class is the one with the highest probability, but you can also look at the class with the second-highest probability, and so on.
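As a concrete illustration of the "coin flip" check from the question, you could compare the two largest softmax probabilities per image (a minimal sketch; the 0.2 threshold is an arbitrary choice):

import numpy as np

probs = model.predict(test_data)          # shape (n_images, 10)
top2 = np.sort(probs, axis=1)[:, -2:]     # the two largest probabilities per image
margin = top2[:, 1] - top2[:, 0]          # small margin = ambiguous prediction
ambiguous = np.where(margin < 0.2)[0]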

Dr. Snoopy
  • Thank you for the answer, however the linked [post](http://mlg.eng.cam.ac.uk/yarin/blog_3d801aa532c1ce.html) seems to disagree with you: *"In this model we feed our prediction into a softmax which gives us probabilities for the different classes (the 10 digits). Interestingly enough, **these probabilities are not enough** to see if our model is certain in its prediction or not. This is because the standard model would pass the predictive mean through the softmax rather than the entire distribution."* Am I missing something? – johndodo Apr 21 '17 at 15:26
  • @johndodo It doesn't disagree with my answer, I never claimed to have the best method, just the simplest one. – Dr. Snoopy Apr 21 '17 at 17:50
  • True. Does it work though? :) I still hope to find some other answer too... – johndodo Apr 21 '17 at 19:37
  • The curve doesn't match that of a nice certainty function. You can get a roundabout idea (if it's very high or very low), but that's about it. You can't depend on the prediction output for certainty – Araymer Aug 26 '17 at 17:08
  • This one works for me. I have 4 classes. Could I use the probability to derive a confidence interval? – Nufa Oct 15 '18 at 02:50
  • I do agree that a probability gives some idea about the uncertainty of the outcome, but the asker is referencing **model uncertainty**, which is like asking: what is the uncertainty of the probability prediction made by the model? You can have a very high certainty about a very 'ambivalent' probability, and vice versa. For instance, you can be 99% certain about a probability being in the range [49.9% - 50.1%]: when you flip a coin, you know that the probability of landing heads is ~50%. – Arjan Groen Aug 05 '19 at 12:33
  • @ArjanGroen I don't see how your comment contributes anything new here; we are talking about the probabilities of a model, and if you calibrate these probabilities, they actually mean what they should mean. – Dr. Snoopy Aug 05 '19 at 14:33
  • @MatiasValdenegro I agree with your comment on probabilities. What my comment aims to contribute is clarifying the difference between the concepts of uncertainty and probability. – Arjan Groen Aug 05 '19 at 14:47
  • Softmax probabilities are not a good metric to measure uncertainty. See https://arxiv.org/abs/1703.04977 – Gilfoyle Feb 04 '20 at 12:25
3

A simpler way is to set training=True on any dropout layers you want to run during inference as well (this essentially tells the layer to always operate as if it were in training mode, so dropout stays active for both training and inference).

import keras

inputs = keras.Input(shape=(10,))
x = keras.layers.Dense(3)(inputs)
outputs = keras.layers.Dropout(0.5)(x, training=True)

model = keras.Model(inputs, outputs)

The code above is from this issue.
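If you already have a trained model like the one in the question, you can get a similar effect without rebuilding it by forcing training mode at call time and averaging several stochastic passes (a sketch assuming TF 2.x eager execution; mc_dropout_predict is my name for the helper, and note that training=True also puts layers such as BatchNormalization into training behaviour, though the question's model has none):

import numpy as np

def mc_dropout_predict(model, x, n_iter=50):
    # Each call with training=True re-samples the dropout masks,
    # so the softmax outputs differ between passes.
    samples = np.stack([model(x, training=True).numpy() for _ in range(n_iter)])
    return samples.mean(axis=0), samples.var(axis=0)

# `model` here is the trained MNIST model from the question
mean_probs, var_probs = mc_dropout_predict(model, test_data[:32])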

abagshaw
  • How can I use your approach if I use `keras.Sequential([])`? – Gilfoyle Feb 04 '20 at 12:28
  • @Samuel I'm late to the party, but maybe it helps others: **No, `training=True` cannot be set if you use the sequential API** - you need to use the functional API as shown in the example. That said, you may want to check out [uncertainty-wizard](https://github.com/testingautomated-usi/uncertainty-wizard), which allows achieving the same in the Sequential API (Disclaimer: I'm the author) – miwe Feb 10 '21 at 15:32