I am trying to train an image classification model to predict a numeric characteristic from an image. I am fairly sure the SparseCategoricalCrossentropy loss function is not right for my case, because during training I need to penalize large differences more than small ones. Ideally I would like to use the mean squared error loss function.
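To illustrate what I mean by penalizing large differences more, here is a small sketch in plain Python. The probabilities and the helper names (`cross_entropy`, `squared_error`, `confident_wrong`) are made up for illustration and do not come from my model: with cross-entropy, a confident wrong guess costs the same whether it is off by 1 or by 9, while squared error grows with the distance.

```python
import math

NUM_CLASSES = 13   # classes 0 .. 12
TRUE_CLASS = 3     # hypothetical ground-truth label


def cross_entropy(probs, true_class):
    """Sparse categorical cross-entropy for one sample (probabilities, not logits)."""
    return -math.log(probs[true_class])


def squared_error(predicted_value, true_value):
    """Squared error for one sample when the prediction is a single number."""
    return (predicted_value - true_value) ** 2


def confident_wrong(predicted_class, p_true=0.05, p_pred=0.90):
    """Made-up distribution: 0.90 on the wrong class, 0.05 on the true class,
    and the remainder spread evenly over the other 11 classes."""
    rest = (1.0 - p_true - p_pred) / (NUM_CLASSES - 2)
    probs = [rest] * NUM_CLASSES
    probs[TRUE_CLASS] = p_true
    probs[predicted_class] = p_pred
    return probs


# Prediction off by one class (4) versus off by nine classes (12).
ce_near = cross_entropy(confident_wrong(4), TRUE_CLASS)
ce_far = cross_entropy(confident_wrong(12), TRUE_CLASS)
se_near = squared_error(4, TRUE_CLASS)
se_far = squared_error(12, TRUE_CLASS)

print("cross-entropy, near vs far:", ce_near, ce_far)
print("squared error, near vs far:", se_near, se_far)
```

The cross-entropy only looks at the probability assigned to the true class, so it cannot tell the two mistakes apart, which is why I would prefer a distance-based loss.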
I followed the TensorFlow image classification tutorial to build the model: https://www.tensorflow.org/tutorials/images/classification
My class names are numbers. I tried both of the following naming schemes:
- ['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']
- ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12']
The only change I made to the tutorial (apart from the dataset) was replacing the SparseCategoricalCrossentropy loss function with 'mean_squared_error'.
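For reference, the change looks roughly like the sketch below. It assumes the tutorial's model layout and 180x180 RGB inputs, and it leaves out the dataset loading and augmentation, so my actual script may differ in the details. `num_classes` is 13 because of the class names listed above.

```python
import tensorflow as tf

num_classes = 13  # class names '0' .. '12'

# Model layout assumed to match the tutorial (simplified, no augmentation).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(180, 180, 3)),
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(num_classes),  # still one output per class
])

model.compile(
    optimizer='adam',
    # Tutorial version:
    # loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    loss='mean_squared_error',  # the only line I changed
    metrics=['accuracy'],
)
```

Everything else is unchanged: the labels are still the integer class indices derived from the directory names, and the last layer still has 13 outputs.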
But this loss function clearly doesn't work for me. The loss value decreases during training, but accuracy never exceeds 5%, and it even drops as the loss gets smaller. The predictions do not make sense either. The data is fine: I can easily reach 95% accuracy with the SparseCategoricalCrossentropy loss function. What am I missing?
UPDATE: I think what I really need is a way to define a regression problem in TensorFlow using images labeled with numbers.