I am developing an end-to-end training and quantization-aware training example. Using the CIFAR-10 dataset, I load a pretrained MobileNetV2 model and then use the code from the TensorFlow guide to quantize it (a rough sketch of the flow I follow is at the end of this question). After the whole process completes without errors, I get the following results:
Quant TFLite test_accuracy: 0.94462
Quant TF test accuracy: 0.744700014591217
TF test accuracy: 0.737500011920929
I wonder how this is possible. Quantization is supposed to reduce accuracy slightly, yet here the quantized TFLite model scores roughly 20 percentage points higher than the float model.
I have noticed that in the TensorFlow guide's example, accuracy also improves slightly, but by far less than in my case. To be more specific, when running this code, which uses the MNIST dataset, I get the results below, which the TensorFlow developers consider acceptable, since they state that there is no change in accuracy.
Quant TFLite test_accuracy: 0.9817
Quant TF test accuracy: 0.9815
TF test accuracy: 0.9811
Note that I haven't changed the code I attached from the TensorFlow guide; I only use a different dataset and model.
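For reference, the flow I follow looks roughly like the sketch below. This is only a condensed illustration of the guide's steps (quantize_model, a short fine-tune, TFLite conversion, and evaluation through the interpreter); the MobileNetV2 construction, epochs, and hyperparameters shown here are placeholders standing in for my actual fine-tuned model, not my exact code.

import numpy as np
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# CIFAR-10, pixel values scaled to [0, 1].
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.cifar10.load_data()
train_images = train_images.astype("float32") / 255.0
test_images = test_images.astype("float32") / 255.0

# Stand-in for my actual model. In my real code this is a pretrained
# MobileNetV2 that has already been fine-tuned on CIFAR-10; here it is
# built from scratch only so the sketch is self-contained.
model = tf.keras.applications.MobileNetV2(
    input_shape=(32, 32, 3), weights=None, classes=10)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_images, train_labels, batch_size=128, epochs=1,
          validation_split=0.1)

# Wrap the trained model for quantization-aware training and fine-tune
# briefly, as in the guide.
q_aware_model = tfmot.quantization.keras.quantize_model(model)
q_aware_model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
q_aware_model.fit(train_images, train_labels, batch_size=128, epochs=1,
                  validation_split=0.1)

# Accuracies of the float baseline and the quantization-aware Keras model.
_, baseline_acc = model.evaluate(test_images, test_labels, verbose=0)
_, q_aware_acc = q_aware_model.evaluate(test_images, test_labels, verbose=0)

# Convert the quantization-aware model to a quantized TFLite model.
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()

# Evaluate the TFLite model with the interpreter, one image at a time.
interpreter = tf.lite.Interpreter(model_content=quantized_tflite_model)
interpreter.allocate_tensors()
input_index = interpreter.get_input_details()[0]["index"]
output_index = interpreter.get_output_details()[0]["index"]

correct = 0
for image, label in zip(test_images, test_labels):
    interpreter.set_tensor(input_index,
                           np.expand_dims(image, 0).astype(np.float32))
    interpreter.invoke()
    correct += int(np.argmax(interpreter.get_tensor(output_index)[0]) == int(label[0]))

print("Quant TFLite test_accuracy:", correct / len(test_images))
print("Quant TF test accuracy:", q_aware_acc)
print("TF test accuracy:", baseline_acc)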