
I'm trying to get a MobileNetV2 model (with its last layers retrained on my dataset) to run on the Google Edge TPU (Coral). I'm able to quantize and compile the model with the 'edgetpu_compiler' (I followed this page https://coral.withgoogle.com/docs/edgetpu/compiler/#usage). But when I run inference on the TPU I get nearly the same output for very different input images.

I've used 'tflite_convert' tool to quantize the model like this:

tflite_convert --output_file=./model.tflite \
  --keras_model_file=models/MobileNet2_best-val-acc.h5 --output_format=TFLITE \
  --inference_type=QUANTIZED_UINT8 --default_ranges_min=0 --default_ranges_max=6 \
  --std_dev_values=127 --mean_values=128 --input_shapes=1,482,640,3 --input_arrays=input_2
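For reference, the same conversion can be expressed with the TF 1.x Python API (a sketch; `convert_quantized` is a helper name I made up, and the flags mirror the command above):

```python
def convert_quantized(h5_path="models/MobileNet2_best-val-acc.h5"):
    import tensorflow as tf  # TF 1.x
    converter = tf.lite.TFLiteConverter.from_keras_model_file(h5_path)
    converter.inference_type = tf.lite.constants.QUANTIZED_UINT8
    # {input_name: (mean, std)} -- mirrors --mean_values / --std_dev_values
    converter.quantized_input_stats = {"input_2": (128, 127)}
    # mirrors --default_ranges_min / --default_ranges_max
    converter.default_ranges_stats = (0, 6)
    return converter.convert()
```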

Then I've used 'edgetpu_compiler' tool to compile it for the TPU:

sudo edgetpu_compiler  model.tflite
Edge TPU Compiler version 2.0.258810407
INFO: Initialized TensorFlow Lite runtime.

Model compiled successfully in 557 ms.

Input model: model.tflite
Input size: 3.44MiB
Output model: model_edgetpu.tflite
Output size: 4.16MiB
On-chip memory available for caching model parameters: 4.25MiB
On-chip memory used for caching model parameters: 3.81MiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 1
Total number of operations: 71
Operation log: model_edgetpu.log
See the operation log file for individual operation details.

Then when I run inference using this code:

from edgetpu.classification.engine import ClassificationEngine
...
labels = ["Class1", "Class2", "Class3", "Class4"]
# engine is a ClassificationEngine built from model_edgetpu.tflite
results = engine.ClassifyWithImage(img, top_k=4)
for result in results:  # result is a (label_id, score) pair
    print('---------------------------')
    print(labels[result[0]])
    print('Score : ', result[1])

The output is like this (assuming labels ["Class1", "Class2", "Class3", "Class4"]):

---------------------------
Class1
Score :  0.2890625
---------------------------
Class2
Score :  0.26953125
---------------------------
Class3
Score :  0.21875
---------------------------
Class4
Score :  0.21875

The output is almost the same for any input image, and usually the first two classes have the same (or very similar) score, as do the 3rd and 4th, as in the example above. It should be ~0.99 for one class, as it is with the .h5 model and even with the unquantized .tflite model.
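Incidentally, the printed scores above are all exact multiples of 1/256, which suggests the engine is just dequantizing raw uint8 outputs as q/256 (my reading of the numbers, not a documented guarantee):

```python
# Recover the raw uint8 values behind the printed scores.
scores = [0.2890625, 0.26953125, 0.21875, 0.21875]
raw = [int(s * 256) for s in scores]
print(raw)  # [74, 69, 56, 56] -> a nearly flat uint8 distribution
```

A healthy classifier would put almost all of the 256 quantization levels on one class; here they are spread nearly evenly.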

Can it be something with the parameters --default_ranges_min=0 --default_ranges_max=6 --std_dev_values=127 --mean_values=128? How can I calculate them?

Edit 1:

Using the answer from this post I've tried to quantize the model with both --std_dev_values=127 --mean_values=128 and --std_dev_values=255 --mean_values=0, but I'm still getting garbage inference. Since MobileNetV2 uses ReLU6, the default ranges should be --default_ranges_min=0 --default_ranges_max=6, right?

The model is a retrained MobileNetV2; the input is an RGB image (3 channels) with input shape 1,482,640,3.
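For what it's worth, tflite_convert defines the input quantization as real_value = (quantized_value - mean_value) / std_dev_value, so mean/std must match the preprocessing the model was trained with. A small sketch (`quant_params` is a helper name I made up) to solve for them from a real input range:

```python
def quant_params(real_min, real_max, qmin=0, qmax=255):
    """Solve real = (q - mean) / std for (mean, std), given the real input range."""
    std = (qmax - qmin) / (real_max - real_min)
    mean = qmin - real_min * std
    return mean, std

print(quant_params(0.0, 1.0))   # (0.0, 255.0)   -> training inputs scaled to [0, 1]
print(quant_params(-1.0, 1.0))  # (127.5, 127.5) -> training inputs scaled to [-1, 1]
```

The commonly quoted mean=128, std=127 is just the integer-rounded version of the [-1, 1] case, which is MobileNet's usual preprocessing.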

  • Hello Paulo, did you solve the issue? I am having similar problem with mobilenet V1. not sure how to solve it. just posted a question here : https://stackoverflow.com/questions/57869149/post-training-quantization-for-mobilenet-v1-not-working – MMH Sep 10 '19 at 11:37
  • Not yet, meanwhile I'm trying to retrain an already quantized mobilenetv1 model, following this guide: https://coral.withgoogle.com/docs/edgetpu/retrain-classification-ondevice/#retrain-the-base-mobilenet-model – Paulo Ribeiro Sep 10 '19 at 15:44
  • Coral uses pillow to downscale input image before inference. Different pillow versions could produce slightly different input tensors. Can you please check pillow version on your platform? Can you also try to downscale the input image to the model input size and test with that image ? – Manoj Sep 17 '19 at 05:01

1 Answer


From your comment on mobilenetv1, it sounds like you are taking a retrained float model and converting it to TFLite. You intended to quantize it by running the command that you listed.

I'd recommend that you take a closer look at the TensorFlow Lite docs. In general, there are two ways to quantize a model: during training, and post-training. The approach you seem to want is post-training.

The proper way of doing it post-training for something like Coral is to follow this guide (https://www.tensorflow.org/lite/performance/post_training_integer_quant), as recommended by the Coral team here (https://coral.withgoogle.com/news/updates-07-2019/).

The flow you're using above is more geared towards training time quantization.
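The guide's representative-dataset flow can be sketched as follows (the dummy generator below stands in for ~100 real preprocessed calibration images, shaped like the question's input; `convert` is a helper name for illustration):

```python
import numpy as np

def representative_dataset():
    # Placeholder: in practice, yield real preprocessed images here.
    for _ in range(100):
        yield [np.random.rand(1, 482, 640, 3).astype(np.float32)]

def convert(h5_path="models/MobileNet2_best-val-acc.h5"):
    import tensorflow as tf  # TF 1.14+ (tf.lite.Optimize, representative_dataset)
    converter = tf.lite.TFLiteConverter.from_keras_model_file(h5_path)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    return converter.convert()
```

The converter then computes the quantization ranges from real activations instead of the hardcoded default_ranges_min/max.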

Alan Chiao
  • The comment on mobilenetv1 is indeed an approach of training-time quantization, but as I said in the comment it is an alternative to the post-training quantization that I've described in the post. I still want to find what I'm doing wrong in the post-training quantization of the retrained MobileNetV2 model that I already have. – Paulo Ribeiro Sep 16 '19 at 09:56
  • To clarify: Using tflite_convert with inference_type=QUANTIZED_UINT8, default_ranges_min=0, default_ranges_max=6, std_dev_values=127, mean_values=128 is not how post-training quantization is done. That is the path for converting a training-time quantized model, which you don't have for mobilenet v1. You only have a float model. In the path you're taking, given a float model, you're effectively hardcoding all aspects of quantization, without any quantization parameters computed from real data or inputs. This doesn't work. The proper way is the guide I pointed to. – Alan Chiao Sep 16 '19 at 17:48
  • Edit: "which you don't have for mobilenet v2" – Alan Chiao Sep 19 '19 at 03:44
  • I've tried to follow that guide before, and ended up with this error: https://stackoverflow.com/questions/57234308/edge-tpu-compiler-error-quantized-dimension-must-be-in-range-0-1-was-3 – Paulo Ribeiro Sep 19 '19 at 10:42