Mean Squared Error for image classification model in TensorFlow

Question

I am trying to train an image classification model to predict a numeric characteristic from an image. I am sure that the SparseCategoricalCrossentropy loss function doesn't work for me, because during training I need to penalize big differences more than small ones. Ideally, I would like to use the Mean Squared Error loss function.

I used the TensorFlow tutorial to prepare the model: https://www.tensorflow.org/tutorials/images/classification.

My class names are numbers. I tried the following options:

['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12']
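One thing worth checking with these two naming schemes: as far as I know, the tutorial's `image_dataset_from_directory` loader assigns integer labels by sorting the folder names as strings, so the label a model sees is the position of the folder in that sorted list, not the number in its name. The sketch below only mimics that sorting in plain Python; `index_to_value` is a hypothetical helper, not a TensorFlow function.

```python
# Folder names as in the two options above.
unpadded = [str(n) for n in range(13)]        # '0', '1', ..., '12'
padded = [f"{n:02d}" for n in range(13)]      # '00', '01', ..., '12'


def index_to_value(folder_names):
    """Return the numeric value behind each label index (hypothetical helper).

    Assumes labels are indices into the string-sorted folder names.
    """
    class_names = sorted(folder_names)
    return [int(name) for name in class_names]


unpadded_values = index_to_value(unpadded)
padded_values = index_to_value(padded)

print(unpadded_values)  # [0, 1, 10, 11, 12, 2, 3, 4, 5, 6, 7, 8, 9]
print(padded_values)    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
```

With zero-padded names the label index equals the number, so the labels can be used directly as numeric targets. With unpadded names, label 2 would stand for the folder '10', and a lookup like `unpadded_values` would be needed to recover the real numbers.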

The only change I made to the tutorial (apart from the dataset) was replacing the SparseCategoricalCrossentropy loss function with 'mean_squared_error'.

But the loss function clearly doesn't work for me. It returns values that get smaller with training, but accuracy never exceeds 5%, and it even goes down as the loss decreases. The predictions also do not make sense. The data is fine: I can easily achieve 95% accuracy with the SparseCategoricalCrossentropy loss function. What am I missing?

UPDATE: I think what I really need is a way to define a regression problem in TensorFlow using images labeled with numbers.
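The difference between the two kinds of loss can be made concrete with a small sketch. It assumes 13 classes (0 to 12) and a true label of 9, and compares a confident wrong guess of 8 with a confident wrong guess of 2. The two functions are plain-Python re-implementations written for illustration, not the Keras classes, and `confident_wrong` is a hypothetical helper.

```python
import math

NUM_CLASSES = 13
TRUE_LABEL = 9


def sparse_categorical_crossentropy(true_class, probs):
    """Cross-entropy for one sample: only the true class's probability matters."""
    return -math.log(probs[true_class])


def squared_error(true_value, predicted_value):
    """Squared error for one sample: grows with the distance between the numbers."""
    return (true_value - predicted_value) ** 2


def confident_wrong(pred_class, p=0.9):
    """Probability vector with p on pred_class, the rest spread evenly (hypothetical)."""
    rest = (1.0 - p) / (NUM_CLASSES - 1)
    probs = [rest] * NUM_CLASSES
    probs[pred_class] = p
    return probs


# Classification view: guessing 8 or 2 instead of 9 costs exactly the same.
ce_near = sparse_categorical_crossentropy(TRUE_LABEL, confident_wrong(8))
ce_far = sparse_categorical_crossentropy(TRUE_LABEL, confident_wrong(2))

# Regression view: guessing 2 is penalized far more than guessing 8.
se_near = squared_error(TRUE_LABEL, 8)
se_far = squared_error(TRUE_LABEL, 2)

print(ce_near == ce_far)  # True
print(se_near, se_far)    # 1 49
```

Cross-entropy treats the classes as unordered, so it cannot express that one wrong answer is closer than another. A squared error on a single numeric output can.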

MSE is **not** a classification loss. Please see [What function defines accuracy in Keras when the loss is mean squared error (MSE)?](https://stackoverflow.com/questions/48775305/what-function-defines-accuracy-in-keras-when-the-loss-is-mean-squared-error-mse/48788577#48788577) — desertnaut, Sep 02 '20 at 10:34
I saw it. It seems, I don't need classification loss. What do I need then? Is it possible to use regression loss on image models? — kharit, Sep 02 '20 at 10:39
What do you mean "image models"? Loss is defined by the kind of *problem* you are trying to solve; if it is a *classification* problem, you cannot use MSE or other losses appropriate for *regression*. — desertnaut, Sep 02 '20 at 10:41
I think what I need is a way to define regression problem in TensorFlow using images labeled with numbers. — kharit, Sep 02 '20 at 10:56
This way, the error between a "9" and a "2" would be greater than between "9" and "8". If this is what you want, sounds indeed like a regression problem. — desertnaut, Sep 02 '20 at 11:01

score 0 · Accepted Answer · answered Sep 02 '20 at 12:38

It turns out it is quite easy to turn an image classification problem into a regression problem. Compared with the tutorial referenced in the question, I had to make the following changes:

1. Used a different dataset, with numbers as the 'classes' (folder names).
2. Changed the loss function to Mean Squared Error (or another loss function suitable for regression).
3. Gave the last layer of the model just 1 neuron instead of one per class (and no softmax):
```
...
layers.Dense(128, activation='relu'),
layers.Dense(1)    # changed from num_classes to 1
```

4. Changed the interpretation of the prediction results:

 ...
 predictions = model.predict(img_array)
 # score = tf.nn.softmax(predictions[0])    # correct for classification, but not regression
 score = predictions.flatten()[0]    # correct result for regression
 ...
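Put together, the changes above might look like the following minimal sketch. Several things here are assumptions made only so the example runs on its own: random 32x32 images stand in for the real dataset, a single convolution layer replaces the tutorial's deeper stack, and `mean_absolute_error` is used as the metric because accuracy is not meaningful for a regression output. Rounding and clipping the result to the 0 to 12 range is an optional extra step, not part of the original answer.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Assumed image size for this sketch only; the tutorial uses larger images.
img_height, img_width = 32, 32

model = tf.keras.Sequential([
    tf.keras.Input(shape=(img_height, img_width, 3)),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(1),  # one linear output: the predicted number (no softmax)
])

model.compile(
    optimizer='adam',
    loss='mean_squared_error',
    metrics=['mean_absolute_error'],  # average distance from the true number
)

# Synthetic stand-in data: 8 random images with numeric targets in 0..12.
rng = np.random.default_rng(0)
images = rng.random((8, img_height, img_width, 3)).astype('float32')
targets = rng.integers(0, 13, size=(8, 1)).astype('float32')

model.fit(images, targets, epochs=1, verbose=0)

# Interpret the prediction as a number rather than as class probabilities.
predictions = model.predict(images[:1], verbose=0)  # shape (1, 1)
score = float(predictions.flatten()[0])

# Optional: snap the continuous output to the nearest valid label.
nearest_label = int(np.clip(np.rint(score), 0, 12))
print(score, nearest_label)
```

On real data, the targets must be the actual numbers the folders represent, which is worth verifying against how the dataset loader assigns labels to folder names.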

But your basic premise by which you begin your question, "*I am trying to teach image **classification** model*", does not hold anymore, does it? — desertnaut, Sep 02 '20 at 14:50
You are right. If I knew this stuff, I wouldn't ask here, — I would just figure out myself, as it is faster. I don't want to update the question, as I think it is clear and I saw a number of people looking for an answer for a similar question and not receiving it. I hope it will help someone. — kharit, Sep 02 '20 at 15:30
