I am trying to quantize a TensorFlow graph stored in a .pb file. The input of the network is a matrix in which each row is normalized to mean 0 and standard deviation 1. I want to create a quantized TensorFlow Lite model so that inference runs faster, but I do not know what to pass for the normalization flags in the conversion command below. Is it just one value? A vector with 64 values? How is it passed?
The model converts fine without quantization. This is my quantized conversion attempt:
tflite_convert \
  --output_file=model_simple_weight_q.tflite \
  --graph_def_file=model_simple.pb \
  --inference_type=QUANTIZED_UINT8 \
  --input_arrays=input \
  --output_arrays=LogSoftmax \
  --mean_values=??? \
  --std_dev_values=???   # I don't know what to pass for these last two flags
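From what I have read (this is my assumption, not something I have confirmed), each of these flags takes one scalar per input array, and the converter maps uint8 inputs back to real values as real_value = (quantized_value - mean_value) / std_dev_value. Here is a minimal Python sketch of how I would derive candidate values, assuming my normalized rows mostly fall in [-4, 4] (the four-standard-deviations range is my guess, not from any documentation):

# Sketch only: derive candidate --mean_values / --std_dev_values for input data
# already normalized to mean 0 and std 1, assuming real values lie in [-4, 4].
float_min, float_max = -4.0, 4.0   # assumed range of my normalized rows
quant_min, quant_max = 0, 255      # uint8 range implied by QUANTIZED_UINT8

# If real_value = (quantized_value - mean_value) / std_dev_value, then
# std_dev_value is quantized units per real unit, and mean_value is the
# quantized value that corresponds to real 0.
std_dev_value = (quant_max - quant_min) / (float_max - float_min)  # 31.875
mean_value = quant_min - float_min * std_dev_value                 # 127.5

print(mean_value, std_dev_value)  # -> 127.5 31.875

If this interpretation is right, single scalars should be acceptable, which makes me suspect the error below is unrelated to the actual values I pass.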
If I pass two single values just to see what happens, for instance --mean_values=127 and --std_dev_values=128, I get the following error:
F tensorflow/lite/toco/graph_transformations/resolve_constant_gather.cc:108] Check failed: coords_array.data_type == ArrayDataType::kInt32 Only int32 indices are supported
Aborted (core dumped)
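For reference, my understanding of the TF 1.x Python equivalent of the command above is the sketch below (assuming tf.lite.TFLiteConverter.from_frozen_graph and quantized_input_stats are the right entry points; the (127.5, 31.875) pair is just the candidate computed earlier, not a value I know to be correct). I have not verified whether it hits the same gather error:

import tensorflow as tf  # TF 1.x

# Sketch of the Python-API equivalent of the tflite_convert command above.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="model_simple.pb",
    input_arrays=["input"],
    output_arrays=["LogSoftmax"],
)
converter.inference_type = tf.lite.constants.QUANTIZED_UINT8
# {input_array_name: (mean_value, std_dev_value)} -- candidate values, unverified.
converter.quantized_input_stats = {"input": (127.5, 31.875)}

tflite_model = converter.convert()
with open("model_simple_weight_q.tflite", "wb") as f:
    f.write(tflite_model)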