43

I would like to calculate the NN model's certainty/confidence (see What my deep model doesn't know) - when the NN tells me an image represents an "8", I would like to know how certain it is. Is my model 99% certain it is an "8", or is it 51% an "8" while it could also be a "6"? Some digits are quite ambiguous and I would like to know for which images the model is just "flipping a coin".

I have found some theoretical writings about this, but I have trouble putting them into code. If I understand correctly, I should evaluate a test image multiple times while "killing off" different neurons (using dropout) and then...?

Working on the MNIST dataset, I am running the following model:

from keras.models import Sequential
from keras.layers import Dense, Activation, Conv2D, Flatten, Dropout

model = Sequential()
model.add(Conv2D(128, kernel_size=(7, 7),
                 activation='relu',
                 input_shape=(28, 28, 1,)))
model.add(Dropout(0.20))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Dropout(0.20))
model.add(Flatten())
model.add(Dense(units=64, activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(units=10, activation='softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])
model.fit(train_data, train_labels,  batch_size=100, epochs=30, validation_data=(test_data, test_labels,))

How should I predict with this model so that I get its certainty about predictions too? I would appreciate some practical examples (preferably in Keras, but any will do).

To clarify, I am looking for an example of how to get certainty using the method outlined by Yarin Gal (or an explanation of why some other method yields better results).

johndodo
  • You can use Monte Carlo Dropout methodology to compute prediction uncertainties (https://stackoverflow.com/a/71750927/10375049). Here are two useful applications in classification (https://towardsdatascience.com/when-your-neural-net-doesnt-know-a-bayesian-approach-with-keras-4782c0818624) and regression (https://towardsdatascience.com/extreme-event-forecasting-with-lstm-autoencoders-297492485037) contexts – Marco Cerliani Apr 11 '22 at 06:46

4 Answers

34

If you want to implement the dropout approach to measure uncertainty, you should do the following:

  1. Implement a function which applies dropout at test time as well:

    import keras.backend as K
    # Passing K.learning_phase() as an extra input lets us switch dropout on at prediction time
    f = K.function([model.layers[0].input, K.learning_phase()],
                   [model.layers[-1].output])
    
  2. Use this function as an uncertainty predictor, e.g. in the following manner:

    import numpy as np

    def predict_with_uncertainty(f, x, n_iter=10):
        # One set of softmax outputs per stochastic forward pass;
        # model.output_shape[-1] is the number of classes (10 for MNIST)
        result = np.zeros((n_iter, x.shape[0], model.output_shape[-1]))

        for i in range(n_iter):
            # f takes [inputs, learning_phase]; learning_phase=1 keeps dropout active
            result[i] = f([x, 1])[0]

        prediction = result.mean(axis=0)    # averaged softmax probabilities
        uncertainty = result.var(axis=0)    # per-class variance across passes
        return prediction, uncertainty
    

Of course, you may use a different function to compute the uncertainty.
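For example, with the MNIST model from the question, the function above could be used roughly like this to find the "coin flip" images (a minimal sketch; the batch size, n_iter=50 and the 0.6 threshold are arbitrary choices, not part of the original answer):

prediction, uncertainty = predict_with_uncertainty(f, test_data[:100], n_iter=50)

best = prediction.argmax(axis=1)                    # most likely digit per image
confidence = prediction.max(axis=1)                 # its averaged softmax probability
spread = uncertainty[np.arange(len(best)), best]    # variance of that probability across passes

# Images where the averaged prediction is far from certain
ambiguous = np.where(confidence < 0.6)[0]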

Marcin Możejko
  • This looks like exactly what I was looking for! Unfortunately the bounty expired while I was away, so I'll start and award another one. Thank you! (EDIT: but of course, I can only award double, and only after 24 hours... so till tomorrow it is... :) ) – johndodo May 01 '17 at 18:00
  • What does n_iter represent in your function, @Marcin Możejko? – Vincent Pakson Apr 23 '18 at 05:59
  • When you say uncertainty, when it says 0.93, does it imply that it is 93% uncertain or is it 93% certain that it is the choice? – Vincent Pakson Apr 23 '18 at 08:11
  • Having issues with this function. What TensorFlow/Keras version was this written with? – hisairnessag3 May 25 '18 at 03:48
  • I don't understand this.. won't the model predict the exact same value for the same input each time? In that case, the var (variance) will be 0 each time.. When I implemented this for my code.. I got the exact same value predicted all 10 times (n_iter=10) – sand Apr 10 '19 at 13:27
  • One question about this answer: wouldn't that way of implementing Bayesian dropout also make other layers (like BN) behave in training mode? – jdeJuan May 05 '19 at 16:34
6

Made a few changes to the top voted answer. Now it works for me.

It's a way to estimate model uncertainty. For other sources of uncertainty, I found https://eng.uber.com/neural-networks-uncertainty-estimation/ helpful.

import keras.backend as K
import numpy as np

f = K.function([model.layers[0].input, K.learning_phase()],
               [model.layers[-1].output])


def predict_with_uncertainty(f, x, n_iter=10):
    result = []

    for i in range(n_iter):
        # learning_phase=1 keeps dropout active; K.function returns a list of outputs
        result.append(f([x, 1])[0])

    result = np.array(result)           # shape: (n_iter, batch_size, 10)

    prediction = result.mean(axis=0)
    uncertainty = result.var(axis=0)
    return prediction, uncertainty
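If you want a single uncertainty number per image rather than a 10-element variance vector, one common summary in the MC-dropout literature is the predictive entropy of the averaged softmax (a minimal sketch; predictive_entropy is my name for it, not part of the original answer):

def predictive_entropy(prediction, eps=1e-12):
    # Higher entropy means the averaged softmax is closer to uniform,
    # i.e. the model is closer to "flipping a coin" between digits.
    return -np.sum(prediction * np.log(prediction + eps), axis=-1)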
Chexn
  • I have tried using your implementation here and while it seems to function, I only seem to receive a matrix of the same predictions, and a matrix of uncertainties containing only 0's? Any help gratefully received! Thanks! – cmp Dec 04 '19 at 21:30
  • You are getting 0s because dropout is disabled during inference. Only if it stays enabled at prediction time will you get different results. You can pass training=True when calling the dropout layer. You can find the detailed article here - https://towardsdatascience.com/is-your-algorithm-confident-enough-1b20dfe2db08 – Malgo Oct 27 '21 at 20:14
3

Your model uses a softmax activation, so the simplest way to obtain some kind of uncertainty measure is to look at the output softmax probabilities:

probs = model.predict(some_input_data)[0]

The probs array will then be a 10-element vector of numbers in the [0, 1] range that sum to 1.0, so they can be interpreted as probabilities. For example, the probability for digit 7 is just probs[7].

With this information you can do some post-processing: typically the predicted class is the one with the highest probability, but you can also look at the class with the second-highest probability, and so on.
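As a concrete illustration of the "coin flip" check from the question, you could compare the two largest softmax probabilities per image (a minimal sketch; the 0.2 threshold is an arbitrary choice):

import numpy as np

probs = model.predict(test_data)          # shape (n_images, 10)
top2 = np.sort(probs, axis=1)[:, -2:]     # the two largest probabilities per image
margin = top2[:, 1] - top2[:, 0]          # small margin = ambiguous prediction
ambiguous = np.where(margin < 0.2)[0]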

Dr. Snoopy
  • Thank you for the answer, however the linked [post](http://mlg.eng.cam.ac.uk/yarin/blog_3d801aa532c1ce.html) seems to disagree with you: *"In this model we feed our prediction into a softmax which gives us probabilities for the different classes (the 10 digits). Interestingly enough, **these probabilities are not enough** to see if our model is certain in its prediction or not. This is because the standard model would pass the predictive mean through the softmax rather than the entire distribution."* Am I missing something? – johndodo Apr 21 '17 at 15:26
  • @johndodo It doesn't disagree with my answer, I never claimed to have the best method, just the simplest one. – Dr. Snoopy Apr 21 '17 at 17:50
  • True. Does it work though? :) I still hope to find some other answer too... – johndodo Apr 21 '17 at 19:37
  • The curve doesn't match that of a nice certainty function. You can get a roundabout idea (if it's very high or very low), but that's about it. You can't depend on the prediction output for certainty – Araymer Aug 26 '17 at 17:08
  • This one works for me. I have 4 classes. Could I use the probability to derive a confidence interval? – Nufa Oct 15 '18 at 02:50
  • I do agree that a probability gives some idea about the uncertainty of the outcome, but the asker is referencing **model uncertainty**, which is like asking: what is the uncertainty of the probability prediction made by the model? You can have a very high certainty about a very 'ambivalent' probability, and vice versa. For instance, you can be 99% certain about a probability being in the range [49.9% - 50.1%]: when you flip a coin, you know that the probability of landing heads is ~50%. – Arjan Groen Aug 05 '19 at 12:33
  • @ArjanGroen I don't see how your comment contributes anything new here; we are talking about the probabilities of a model, and if you calibrate these probabilities, they actually mean what they should mean. – Dr. Snoopy Aug 05 '19 at 14:33
  • @MatiasValdenegro I agree with your comment on probabilities. What my comment aims to contribute is clarifying the difference between the concepts of uncertainty and probability. – Arjan Groen Aug 05 '19 at 14:47
  • Softmax probabilities are not a good metric to measure uncertainty. See https://arxiv.org/abs/1703.04977 – Gilfoyle Feb 04 '20 at 12:25
3

A simpler way is to set training=True on any dropout layers you want to run during inference as well (this essentially tells the layer to always operate as if it were in training mode, so dropout stays active for both training and inference).

import keras

inputs = keras.Input(shape=(10,))
x = keras.layers.Dense(3)(inputs)
outputs = keras.layers.Dropout(0.5)(x, training=True)

model = keras.Model(inputs, outputs)

The code above is from this issue.
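If you already have a trained model like the one in the question, you can get a similar effect without rebuilding it by forcing training mode at call time and averaging several stochastic passes (a sketch assuming TF 2.x eager execution; mc_dropout_predict is my name for the helper, and note that training=True also puts layers such as BatchNormalization into training behaviour, though the question's model has none):

import numpy as np

def mc_dropout_predict(model, x, n_iter=50):
    # Each call with training=True re-samples the dropout masks,
    # so the softmax outputs differ between passes.
    samples = np.stack([model(x, training=True).numpy() for _ in range(n_iter)])
    return samples.mean(axis=0), samples.var(axis=0)

# `model` here is the trained MNIST model from the question
mean_probs, var_probs = mc_dropout_predict(model, test_data[:32])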

abagshaw
  • How can I use your approach if I use `keras.Sequential([])`? – Gilfoyle Feb 04 '20 at 12:28
  • @Samuel I'm late to the party, but maybe it helps others: **No, `training=True` cannot be set if you use the sequential API** - you need to use the functional API as shown in the example. That said, you may want to check out [uncertainty-wizard](https://github.com/testingautomated-usi/uncertainty-wizard), which allows achieving the same in the Sequential API (Disclaimer: I'm the author) – miwe Feb 10 '21 at 15:32