
What is the difference between categorical_accuracy and sparse_categorical_accuracy in Keras? There is no hint in the documentation for these metrics, and asking Dr. Google did not turn up an answer either.

The source code can be found here:

def categorical_accuracy(y_true, y_pred):
    return K.cast(K.equal(K.argmax(y_true, axis=-1),
                          K.argmax(y_pred, axis=-1)),
                  K.floatx())


def sparse_categorical_accuracy(y_true, y_pred):
    return K.cast(K.equal(K.max(y_true, axis=-1),
                          K.cast(K.argmax(y_pred, axis=-1), K.floatx())),
                  K.floatx())
Nicolas Gervais
jcklie
    Maybe this can help : https://stackoverflow.com/a/43546939/3374996 . Something to do with targets. I am not sure if by targets they mean the y_true, y_pred are sparse or the output of categorical accuracy is sparse. – Vivek Kumar Jun 11 '17 at 02:19
  • 2
    Pretty bad that this isn't in the docs nor the docstrings. – Denziloe Aug 04 '19 at 22:06

4 Answers


So in categorical_accuracy you need to specify your target (y) as a one-hot encoded vector (e.g. in the case of 3 classes, when the true class is the second one, y should be (0, 1, 0)). In sparse_categorical_accuracy you should instead provide only an integer for the true class (in the previous example it would be 1, as class indexing is 0-based).
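A quick sketch of the two label formats, using NumPy to mirror the backend ops on toy data (this is an illustration, not the actual Keras implementation):

```python
import numpy as np

# Same three samples, two label encodings (toy data, 3 classes)
y_onehot = np.array([[0., 1., 0.],   # class 1, for categorical_accuracy
                     [1., 0., 0.],   # class 0
                     [0., 0., 1.]])  # class 2
y_sparse = np.array([1, 0, 2])       # same classes, for sparse_categorical_accuracy

y_pred = np.array([[.1, .8, .1],
                   [.3, .3, .4],     # wrong: predicts class 2, true class is 0
                   [.2, .2, .6]])

# categorical_accuracy: compare argmax of one-hot truth with argmax of prediction
cat_acc = np.mean(np.argmax(y_onehot, axis=-1) == np.argmax(y_pred, axis=-1))

# sparse_categorical_accuracy: compare integer truth with argmax of prediction
sparse_acc = np.mean(y_sparse == np.argmax(y_pred, axis=-1))

print(cat_acc, sparse_acc)  # both 0.666..., the metrics agree on the same data
```

Both encodings describe the same labels, so the two metrics produce the same value; only the expected shape of y differs.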

Marcin Możejko
    @MarcinMożejko I think you are wrong in your terminology - in sparse categorical accuracy you do not **need** to provide an integer - instead you **may** provide an array of length one with the index only - since keras chooses the max value from the array - but you may also provide an array of any length - for example of three results - and keras will choose the maximum value from this array and check if it corresponds to the index of the max value in y_pred – aviv Mar 31 '19 at 11:18
    @aviv Follow up question - how is this different from just "accuracy"? Thanks. – user3303020 Apr 01 '19 at 15:21
    @user3303020 when you tell keras to use `"accuracy"` keras is using the default accuracy which is `categorical_accuracy` – aviv Apr 03 '19 at 11:06
    If you look at https://keras.io/api/metrics/accuracy_metrics/#accuracy-class, I think categorical_accuracy require label to be one hot encoded, while for accuracy the label can't be one hot encoded. My own tests confirms this. – user4918159 Aug 04 '20 at 16:08

Looking at the source

def categorical_accuracy(y_true, y_pred):
    return K.cast(K.equal(K.argmax(y_true, axis=-1),
                          K.argmax(y_pred, axis=-1)),
                  K.floatx())


def sparse_categorical_accuracy(y_true, y_pred):
    return K.cast(K.equal(K.max(y_true, axis=-1),
                          K.cast(K.argmax(y_pred, axis=-1), K.floatx())),
                  K.floatx())

categorical_accuracy checks to see if the index of the maximal true value is equal to the index of the maximal predicted value.

sparse_categorical_accuracy checks to see if the maximal true value is equal to the index of the maximal predicted value.

As Marcin's answer above explains, categorical_accuracy corresponds to a one-hot encoded y_true.
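To see why `K.max(y_true, axis=-1)` appears in the sparse variant, here is a NumPy sketch (toy data, mirroring the backend ops): each y_true row holds a single integer label, so taking the max over the last axis simply unwraps that integer.

```python
import numpy as np

y_true = np.array([[1.], [0.], [2.]])   # sparse labels: one integer per sample
y_pred = np.array([[.1, .8, .1],
                   [.6, .3, .1],
                   [.2, .2, .6]])

# max over a length-1 last axis just extracts the label itself
true_labels = np.max(y_true, axis=-1)      # -> [1., 0., 2.]

# argmax over the prediction row picks the predicted class index
pred_labels = np.argmax(y_pred, axis=-1)   # -> [1, 0, 2]

acc = np.mean(true_labels == pred_labels)  # 1.0: every prediction matches
```

So the max is not choosing among competing values; it is a shape-flattening trick for `(batch, 1)`-shaped targets.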

Matti Lyra
    Aren't we passing integers instead of one-hot vectors in sparse mode? why then it takes the maximum in the line K.max(y_true, axis=-1) ?? :/ shouldn't there be only one value in y_true I mean? – leo Mar 23 '20 at 15:58

The sparse_categorical_accuracy expects sparse targets:

[[0], [1], [2]]

For instance:

import tensorflow as tf

sparse = [[0], [1], [2]]
logits = [[.8, .1, .1], [.5, .3, .2], [.2, .2, .6]]

sparse_cat_acc = tf.metrics.SparseCategoricalAccuracy()
sparse_cat_acc(sparse, logits)
<tf.Tensor: shape=(), dtype=float64, numpy=0.6666666666666666>

categorical_accuracy expects one hot encoded targets:

[[1., 0., 0.],  [0., 1., 0.], [0., 0., 1.]]

For instance:

onehot = [[1., 0., 0.],  [0., 1., 0.], [0., 0., 1.]]
logits = [[.8, .1, .1], [.5, .3, .2], [.2, .2, .6]]

cat_acc = tf.metrics.CategoricalAccuracy()
cat_acc(onehot, logits)
<tf.Tensor: shape=(), dtype=float64, numpy=0.6666666666666666>
Innat
Nicolas Gervais

One difference that I just hit is the difference in the name of the metrics.

with categorical_accuracy, this worked:

mcp_save_acc = ModelCheckpoint('model_' + 'val_acc{val_accuracy:.3f}.hdf5', save_best_only=True, monitor='val_accuracy', mode='max')

but after switching to sparse_categorical accuracy, I now need this:

mcp_save_acc = ModelCheckpoint('model_' + 'val_acc{val_sparse_categorical_accuracy:.3f}.hdf5', save_best_only=True, monitor='val_sparse_categorical_accuracy', mode='max')

even though I still have metrics=['accuracy'] as an argument to my compile() function.

I kind of wish val_acc and/or val_accuracy just worked for all keras' inbuilt *_crossentropy losses.
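A minimal sketch of why the template matters: Keras passes callbacks a logs dict keyed by each metric's resolved name, and ModelCheckpoint fills the filename template from those keys, so the template must name the metric exactly. (The logs values here are hypothetical, no training involved.)

```python
# Hypothetical epoch logs as a callback would receive them; as described
# above, with a sparse loss the validation metric key can resolve to
# 'val_sparse_categorical_accuracy' rather than 'val_accuracy'.
logs = {'val_sparse_categorical_accuracy': 0.913}

# The filename template is filled from these keys, so a mismatched name
# (e.g. '{val_accuracy}') would raise a KeyError at checkpoint time.
filename = 'model_val_acc{val_sparse_categorical_accuracy:.3f}.hdf5'.format(**logs)
print(filename)  # model_val_acc0.913.hdf5
```

Printing `history.history.keys()` after a short training run is a quick way to find the exact key names to monitor.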

craq