I am trying to convert a trained model from a checkpoint file to TFLite. I am using tf.lite.TFLiteConverter.
The float conversion went fine and runs at a reasonable inference speed, but inference with the INT8 conversion is very slow. To debug, I tried converting a very small network and found that the INT8 model is generally slower at inference than the float model.
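For reference, this is roughly how I run the two conversions. The model-building helper, checkpoint path, input shape, and file names below are placeholders for my actual setup:

```python
import numpy as np
import tensorflow as tf

# Restore the trained weights and export a SavedModel first.
# build_model() and "path/to/checkpoint" stand in for my real model code.
model = build_model()
ckpt = tf.train.Checkpoint(model=model)
ckpt.restore("path/to/checkpoint").expect_partial()
tf.saved_model.save(model, "exported_saved_model")

# Float conversion: this one runs at a reasonable speed.
converter = tf.lite.TFLiteConverter.from_saved_model("exported_saved_model")
float_tflite = converter.convert()

# INT8 conversion: full integer quantization with a representative dataset.
def representative_dataset():
    for _ in range(100):
        # Dummy calibration data matching the model's input shape.
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("exported_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
int8_tflite = converter.convert()  # this model is much slower at inference
```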
In the INT8 tflite file, I found some tensors called ReadVariableOp, which don't exist in TensorFlow's official MobileNet tflite model.
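This is how I inspected the tensors in the converted file (the model path is a placeholder):

```python
import tensorflow as tf

# List every tensor in the quantized model; the ReadVariableOp
# tensors show up in this listing.
interpreter = tf.lite.Interpreter(model_path="model_int8.tflite")
interpreter.allocate_tensors()
for detail in interpreter.get_tensor_details():
    print(detail["name"], detail["dtype"], detail["shape"])
```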
What could be causing the INT8 inference to be this slow?