
What is the difference between categorical_accuracy and sparse_categorical_accuracy in Keras? There is no hint in the documentation for these metrics, and asking Dr. Google did not turn up an answer either.

The source code can be found here:

def categorical_accuracy(y_true, y_pred):
    return K.cast(K.equal(K.argmax(y_true, axis=-1),
                          K.argmax(y_pred, axis=-1)),
                  K.floatx())


def sparse_categorical_accuracy(y_true, y_pred):
    return K.cast(K.equal(K.max(y_true, axis=-1),
                          K.cast(K.argmax(y_pred, axis=-1), K.floatx())),
                  K.floatx())
Nicolas Gervais
jcklie
    Maybe this can help : https://stackoverflow.com/a/43546939/3374996 . Something to do with targets. I am not sure if by targets they mean the y_true, y_pred are sparse or the output of categorical accuracy is sparse. – Vivek Kumar Jun 11 '17 at 02:19
  • 2
    Pretty bad that this isn't in the docs nor the docstrings. – Denziloe Aug 04 '19 at 22:06

4 Answers


So in categorical_accuracy you need to specify your target (y) as a one-hot encoded vector (e.g. in the case of 3 classes, when the true class is the second one, y should be (0, 1, 0)). In sparse_categorical_accuracy you should instead provide only an integer for the true class (in the previous example it would be 1, as class indexing is 0-based).
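A quick sketch of the two label formats, using NumPy to mirror the backend ops on toy data (this is an illustration, not the actual Keras implementation):

```python
import numpy as np

# Same three samples, two label encodings (toy data, 3 classes)
y_onehot = np.array([[0., 1., 0.],   # class 1, for categorical_accuracy
                     [1., 0., 0.],   # class 0
                     [0., 0., 1.]])  # class 2
y_sparse = np.array([1, 0, 2])       # same classes, for sparse_categorical_accuracy

y_pred = np.array([[.1, .8, .1],
                   [.3, .3, .4],     # wrong: predicts class 2, true class is 0
                   [.2, .2, .6]])

# categorical_accuracy: compare argmax of one-hot truth with argmax of prediction
cat_acc = np.mean(np.argmax(y_onehot, axis=-1) == np.argmax(y_pred, axis=-1))

# sparse_categorical_accuracy: compare integer truth with argmax of prediction
sparse_acc = np.mean(y_sparse == np.argmax(y_pred, axis=-1))

print(cat_acc, sparse_acc)  # both 0.666..., the metrics agree on the same data
```

Both encodings describe the same labels, so the two metrics produce the same value; only the expected shape of y differs.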

Marcin Możejko
    @MarcinMożejko I think you are wrong in your terminology - in sparse categorical accuracy you do not **need** to provide an integer - instead you **may** provide an array of length one with the index only - since keras chooses the max value from the array - but you may also provide an array of any length - for example of three results - and keras will choose the maximum value from this array and check if it corresponds to the index of the max value in y_pred – aviv Mar 31 '19 at 11:18
    @aviv Follow up question - how is this different from just "accuracy"? Thanks. – user3303020 Apr 01 '19 at 15:21
    @user3303020 when you tell keras to use `"accuracy"` keras is using the default accuracy which is `categorical_accuracy` – aviv Apr 03 '19 at 11:06
    If you look at https://keras.io/api/metrics/accuracy_metrics/#accuracy-class, I think categorical_accuracy require label to be one hot encoded, while for accuracy the label can't be one hot encoded. My own tests confirms this. – user4918159 Aug 04 '20 at 16:08

Looking at the source

def categorical_accuracy(y_true, y_pred):
    return K.cast(K.equal(K.argmax(y_true, axis=-1),
                          K.argmax(y_pred, axis=-1)),
                  K.floatx())


def sparse_categorical_accuracy(y_true, y_pred):
    return K.cast(K.equal(K.max(y_true, axis=-1),
                          K.cast(K.argmax(y_pred, axis=-1), K.floatx())),
                  K.floatx())

categorical_accuracy checks to see if the index of the maximal true value is equal to the index of the maximal predicted value.

sparse_categorical_accuracy checks to see if the maximal true value is equal to the index of the maximal predicted value.

As Marcin's answer above explains, categorical_accuracy corresponds to a one-hot encoded y_true.
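To see why `K.max(y_true, axis=-1)` appears in the sparse variant, here is a NumPy sketch (toy data, mirroring the backend ops): each y_true row holds a single integer label, so taking the max over the last axis simply unwraps that integer.

```python
import numpy as np

y_true = np.array([[1.], [0.], [2.]])   # sparse labels: one integer per sample
y_pred = np.array([[.1, .8, .1],
                   [.6, .3, .1],
                   [.2, .2, .6]])

# max over a length-1 last axis just extracts the label itself
true_labels = np.max(y_true, axis=-1)      # -> [1., 0., 2.]

# argmax over the prediction row picks the predicted class index
pred_labels = np.argmax(y_pred, axis=-1)   # -> [1, 0, 2]

acc = np.mean(true_labels == pred_labels)  # 1.0: every prediction matches
```

So the max is not choosing among competing values; it is a shape-flattening trick for `(batch, 1)`-shaped targets.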

Matti Lyra
    Aren't we passing integers instead of one-hot vectors in sparse mode? why then it takes the maximum in the line K.max(y_true, axis=-1) ?? :/ shouldn't there be only one value in y_true I mean? – leo Mar 23 '20 at 15:58

The sparse_categorical_accuracy expects sparse targets:

[[0], [1], [2]]

For instance:

import tensorflow as tf

sparse = [[0], [1], [2]]
logits = [[.8, .1, .1], [.5, .3, .2], [.2, .2, .6]]

sparse_cat_acc = tf.metrics.SparseCategoricalAccuracy()
sparse_cat_acc(sparse, logits)
<tf.Tensor: shape=(), dtype=float64, numpy=0.6666666666666666>

categorical_accuracy expects one hot encoded targets:

[[1., 0., 0.],  [0., 1., 0.], [0., 0., 1.]]

For instance:

onehot = [[1., 0., 0.],  [0., 1., 0.], [0., 0., 1.]]
logits = [[.8, .1, .1], [.5, .3, .2], [.2, .2, .6]]

cat_acc = tf.metrics.CategoricalAccuracy()
cat_acc(onehot, logits)
<tf.Tensor: shape=(), dtype=float64, numpy=0.6666666666666666>
Innat
Nicolas Gervais

One difference that I just hit is the difference in the name of the metrics.

with categorical_accuracy, this worked:

mcp_save_acc = ModelCheckpoint('model_' + 'val_acc{val_accuracy:.3f}.hdf5', save_best_only=True, monitor='val_accuracy', mode='max')

but after switching to sparse_categorical accuracy, I now need this:

mcp_save_acc = ModelCheckpoint('model_' + 'val_acc{val_sparse_categorical_accuracy:.3f}.hdf5', save_best_only=True, monitor='val_sparse_categorical_accuracy', mode='max')

even though I still have metrics=['accuracy'] as an argument to my compile() function.

I kind of wish val_acc and/or val_accuracy just worked for all keras' inbuilt *_crossentropy losses.
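A minimal sketch of why the template matters: Keras passes callbacks a logs dict keyed by each metric's resolved name, and ModelCheckpoint fills the filename template from those keys, so the template must name the metric exactly. (The logs values here are hypothetical, no training involved.)

```python
# Hypothetical epoch logs as a callback would receive them; as described
# above, with a sparse loss the validation metric key can resolve to
# 'val_sparse_categorical_accuracy' rather than 'val_accuracy'.
logs = {'val_sparse_categorical_accuracy': 0.913}

# The filename template is filled from these keys, so a mismatched name
# (e.g. '{val_accuracy}') would raise a KeyError at checkpoint time.
filename = 'model_val_acc{val_sparse_categorical_accuracy:.3f}.hdf5'.format(**logs)
print(filename)  # model_val_acc0.913.hdf5
```

Printing `history.history.keys()` after a short training run is a quick way to find the exact key names to monitor.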

craq