
I'm new to the world of TensorFlow and I'm working on the simple example of MNIST dataset classification. I would like to know how I can obtain other metrics (e.g. precision, recall, etc.) in addition to accuracy and loss (and possibly display them). Here's my code:

from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.callbacks import TensorBoard
import time
import os

#load mnist dataset
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

#create and compile the model
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)), 
  tf.keras.layers.Dense(128, activation='relu'), 
  tf.keras.layers.Dropout(0.2), 
  tf.keras.layers.Dense(10, activation='softmax') 
])
model.summary()

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

#model checkpoint (only if there is an improvement)

checkpoint_path = "logs/weights-improvement-{epoch:02d}-{accuracy:.2f}.hdf5"

cp_callback = ModelCheckpoint(checkpoint_path, monitor='accuracy', save_best_only=True, verbose=1, mode='max')

#Tensorboard
NAME = "tensorboard_{}".format(int(time.time())) #name of the model with timestamp
tensorboard = TensorBoard(log_dir="logs/{}".format(NAME))

#train the model
model.fit(x_train, y_train, callbacks = [cp_callback, tensorboard], epochs=5)

#evaluate the model
model.evaluate(x_test,  y_test, verbose=2)

Since I get only accuracy and loss, how can I get other metrics? Thank you in advance; I'm sorry if it is a simple question or if it was already answered somewhere.

Diane.95

5 Answers


Starting from TensorFlow 2.X, precision and recall are both available as built-in metrics.

Therefore, you do not need to implement them by hand. They were removed in Keras 2.X because they were misleading: since they were computed batch-wise, the global (true) values of precision and recall would actually be different.

You can have a look here: https://www.tensorflow.org/api_docs/python/tf/keras/metrics/Recall

Now they have a built-in accumulator, which ensures the correct calculation of those metrics.
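To see why the accumulator matters, here is a small standalone sketch (not from the linked docs) comparing per-batch averaging with the accumulated result:

import tensorflow as tf

precision = tf.keras.metrics.Precision()

# Batch 1: one predicted positive, one correct  -> per-batch precision 1.0
precision.update_state([1, 0, 0], [1, 0, 0])
# Batch 2: ten predicted positives, one correct -> per-batch precision 0.1
precision.update_state([1] + [0] * 9, [1] * 10)

# Averaging the per-batch values would give (1.0 + 0.1) / 2 = 0.55,
# but the accumulated (global) precision is 2 / 11 ≈ 0.18.
print(float(precision.result()))  # ~0.1818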

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy', tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
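As the comments below point out, Precision() and Recall() will complain when paired with sparse_categorical_crossentropy and integer labels. A possible workaround for the question's 10-class problem (a sketch under the assumption that one-hot labels are acceptable; class_id=3 is only an illustration) is:

# Assumption: one-hot encode the integer MNIST labels so Precision/Recall receive binary targets per class
y_train_onehot = tf.keras.utils.to_categorical(y_train, num_classes=10)
y_test_onehot = tf.keras.utils.to_categorical(y_test, num_classes=10)

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy',
                       tf.keras.metrics.Precision(),          # micro-averaged over all class slots
                       tf.keras.metrics.Recall(class_id=3)])  # recall for class 3 only

model.fit(x_train, y_train_onehot, epochs=5)
model.evaluate(x_test, y_test_onehot, verbose=2)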
Timbus Calin
  • Thank you! In this case my problem is not binary classification; since I have 10 classes, I get an error. Is there a way to calculate precision/recall or other useful metrics for multi-class problems? Thank you – Diane.95 Mar 10 '20 at 18:54
  • @Diane.95 I will come back with an answer today, after the working day ends. – Timbus Calin Mar 16 '20 at 07:18

I am adding another answer because this is the cleanest way to compute these metrics correctly on your test set (as of the 22nd of March 2020).

The first thing you need to do is to create a custom callback, in which you send your test data:

import tensorflow as tf
from tensorflow.keras.callbacks import Callback
from sklearn.metrics import classification_report 

class MetricsCallback(Callback):
    def __init__(self, test_data, y_true):
        super().__init__()
        # y_true should be the label encoding of your classes
        self.y_true = y_true
        self.test_data = test_data

    def on_epoch_end(self, epoch, logs=None):
        # Here we get the probabilities
        y_pred = self.model.predict(self.test_data)
        # Here we get the actual classes
        y_pred = tf.argmax(y_pred, axis=1)
        # Actual dictionary
        report_dictionary = classification_report(self.y_true, y_pred, output_dict=True)
        # Only printing the report
        print(classification_report(self.y_true, y_pred, output_dict=False))

In your main, where you load your dataset and add the callbacks:

metrics_callback = MetricsCallback(test_data=my_test_data, y_true=my_y_true)
...
...
#train the model
model.fit(x_train, y_train, callbacks=[cp_callback, metrics_callback, tensorboard], epochs=5)
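For the MNIST code in the question, the placeholders would presumably be the held-out test split (an assumption; the answer leaves my_test_data and my_y_true unspecified):

metrics_callback = MetricsCallback(test_data=x_test, y_true=y_test)

On each epoch end the callback then prints per-class precision, recall and F1-score via sklearn's classification_report.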

         
Timbus Calin
  • Thank you for your time and answer! I checked the code, and I create the object metrics_callback = MetricsCallback(test_data=x_train, y_true=y_train). When I fit the model though, I get an error: "name 'y_true' is not defined" on "report_dictionary = classification_report(y_true, y_pred, output_dict=True)" even if I'm passing the parameters :( – Diane.95 Mar 27 '20 at 11:03
  • We forgot to pass the 'self.' :D. So it should be self.y_true – Timbus Calin Mar 27 '20 at 11:08
  • Thank you Timbus! It was very helpful, you're the best. – Diane.95 Apr 02 '20 at 09:31
  • Happy to see you solve your problem. Could you please also upvote since you accepted the answer? Thank you and glad to have helped you. – Timbus Calin Apr 02 '20 at 09:36
  • @TimbusCalin, I am trying to use metrics_callback for my usecase, but when i am trying model.fit am getting "NameError:name 'classification_report' is not defined". Please can help me. – bsquare Apr 03 '20 at 13:29
  • You need to add the following line of code. : 'from sklearn.metrics import classification_report'. If you found my answer useful, please upvote my answer. – Timbus Calin Apr 03 '20 at 13:45
  • @TimbusCalin I'm sorry but my reputation is less than 15 therefore I cannot upvote :( – Diane.95 Apr 04 '20 at 16:00
  • @TimbusCalin Done! Thank you :) – Diane.95 Apr 08 '20 at 19:15

There is a list of available metrics in the Keras documentation. It includes recall, precision, etc.

For instance, recall:

model.compile('adam', loss='binary_crossentropy', 
    metrics=[tf.keras.metrics.Recall()])
Nicolas Gervais
  • The link that you provided leads to the Keras documentation, which is different from the Keras documentation inside TensorFlow. In addition, the person who implemented them makes it clear that they are more misleading than helpful. – Timbus Calin Mar 10 '20 at 12:11
  • Creating a custom `callback` for other built-in metrics doesn't make any sense to me. I think this should be the accepted answer. – Innat Dec 10 '20 at 19:23

I could not get Timbus' answer to work and I found a very interesting explanation here.

It says: The meaning of 'accuracy' depends on the loss function. The one that corresponds to sparse_categorical_crossentropy is tf.keras.metrics.SparseCategoricalAccuracy(), not tf.metrics.Accuracy(). Which makes a lot of sense.

So which metrics you can use depends on the loss you chose. E.g. the metric 'TruePositives' won't work with sparse_categorical_crossentropy, because that loss implies you're working with more than one class, and True Positives are only defined for binary classification problems.

A metric like tf.keras.metrics.CategoricalCrossentropy() will work because it is designed with multiple classes in mind! Example:

from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.callbacks import TensorBoard
import time
import os 

#load mnist dataset
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

#create and compile the model
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)), 
  tf.keras.layers.Dense(128, activation='relu'), 
  tf.keras.layers.Dropout(0.2), 
  tf.keras.layers.Dense(10, activation='softmax') 
])
model.summary()

# This will work because it makes sense
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=[tf.keras.metrics.SparseCategoricalAccuracy(),
                       tf.keras.metrics.CategoricalCrossentropy()])

# This will not work because TruePositives isn't designed for the multiclass classification problem
# (also note that this second compile() call would override the first one)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=[tf.keras.metrics.SparseCategoricalAccuracy(),
                       tf.keras.metrics.TruePositives()])

#model checkpoint (only if there is an improvement)

checkpoint_path = "logs/weights-improvement-{epoch:02d}-{accuracy:.2f}.hdf5"

cp_callback = ModelCheckpoint(checkpoint_path,
                              monitor='accuracy',
                              save_best_only=True,
                              verbose=1,
                              mode='max')

#Tensorboard
NAME = "tensorboard_{}".format(int(time.time())) # name of the model with timestamp
tensorboard = TensorBoard(log_dir="logs/{}".format(NAME))

#train the model
model.fit(x_train, y_train, epochs=5)

#evaluate the model
model.evaluate(x_test,  y_test, verbose=2)

In my case the other 2 answers gave me shape mismatches.

Victor Sonck
  • Thank you for your help, it's working! I tried the other answers, but since mine is not a binary classification problem (I have 10 classes), I get an error for precision and recall. Is there a chance to calculate them? Thank you – Diane.95 Mar 10 '20 at 18:52
  • @Diane.95 I will come back with a good answer today after the working day ends. – Timbus Calin Mar 16 '20 at 07:17
  • @Victor Sonck btw my answer above is working for multiclass now, if you want to check it (short reminder since I don't know where you are with the thread) :D – Timbus Calin Jul 20 '20 at 05:44

For a list of supported metrics, see:

tf.keras Metrics

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy', tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
Ram
  • That doesn't work because you cannot calculate precision and recall with sparse_categorical_crossentropy, since (I think) they are designed for binary classification problems. – Diane.95 Mar 10 '20 at 19:11