I have followed the TensorFlow 2 documentation to convert my trained tf.estimator model to a TFLite model. To convert the model, I first had to export it in the saved_model format with an input_receiver_fn, and then convert it with the SELECT_TF_OPS flag:
import tensorflow as tf

# Train the estimator on the three float features
classifier = tf.estimator.LinearClassifier(n_classes=2, model_dir=classifier_dir, feature_columns=features)
classifier.train(input_fn=lambda: train_fn(features=train_datas, labels=train_labels))

# Export a SavedModel with a parsing serving_input_receiver_fn
serving_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(
    tf.feature_column.make_parse_example_spec(features))
classifier.export_saved_model(classifier_dir + "/saved_model", serving_input_fn)

# Convert to TFLite with select TF ops enabled
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir=saved_model_dir, signature_keys=['serving_default'])
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
I want to run my TFLite model on an ARM device without Python support, so I built the C++ interpreter shared libraries with Bazel as explained in the documentation:
Cross-compile for armhf with Bazel
Select TensorFlow operators C++
My model has 3 input features, but when I try to follow the inference guide I get a segmentation fault. I used the following Python code to inspect the model's inputs and outputs:
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="./model.tflite")
interpreter.allocate_tensors()
print("all ok")

# Print input shape and type
inputs = interpreter.get_input_details()
print('{} input(s):'.format(len(inputs)))
for i in range(0, len(inputs)):
    print('{} {}'.format(inputs[i]['shape'], inputs[i]['dtype']))

# Print output shape and type
outputs = interpreter.get_output_details()
print('\n{} output(s):'.format(len(outputs)))
for i in range(0, len(outputs)):
    print('{} {}'.format(outputs[i]['shape'], outputs[i]['dtype']))
I got the following output:
all ok
1 input(s):
[1] <class 'numpy.bytes_'>
2 output(s):
[1 2] <class 'numpy.bytes_'>
[1 2] <class 'numpy.float32'>
The first few lines of the output of tflite::PrintInterpreterState(interpreter.get()) are:
INFO: Created TensorFlow Lite delegate for select TF ops.
INFO: TfLiteFlexDelegate delegate: 1 nodes delegated out of 25 nodes with 1 partitions.
Interpreter has 54 tensors and 26 nodes
Inputs: 0
Outputs: 38 34
Tensor 0 input_example_tensor kTfLiteString kTfLiteDynamic 0 bytes ( 0.0 MB) 1
The output shows that the input shape is not the same as in the original model, and the input type is <class 'numpy.bytes_'>, while the TensorFlow 2 model's inputs are [numpy.float32, numpy.float32, numpy.float32]. My input dictionary for prediction with the TF2 model looks like: {'feature0' : data0, 'feature1' : data1, 'feature2' : data2}
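To double-check this from the C++ side before filling any buffers, I added a small diagnostic helper to the program shown further below. It queries the first input tensor directly through the interpreter; the field names (name, type, dims) come from the TfLiteTensor struct in the TFLite headers, and this is only a sketch:

#include <cstdio>
#include "tensorflow/lite/interpreter.h"

// Print name, type and shape of the model's first input tensor, to confirm
// from C++ that it really is the single kTfLiteString input reported above.
void DescribeFirstInput(const tflite::Interpreter& interpreter)
{
    const TfLiteTensor* t = interpreter.tensor(interpreter.inputs()[0]);
    printf("name=%s type=%d (kTfLiteString=%d)\n", t->name, t->type, kTfLiteString);
    printf("dims:");
    for (int i = 0; i < t->dims->size; ++i)
        printf(" %d", t->dims->data[i]);
    printf("\n");
}

This only confirms what PrintInterpreterState already reports, but it is easier to branch on in code.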
Here is the Google Colab link to the TensorFlow model. I had no previous experience with running inference on TensorFlow Lite models, so I searched first and found these related questions, which helped me write the C++ code below:
TensorFlow Lite C++ API example for inference
How to give multi-dimensional inputs to tflite via C++ API
I tried to fill the input buffer with a vector of zeros, but without success. Here is my C++ code to load the TFLite model and feed it inputs for prediction. Can someone please point me in the right direction? I could not find any examples or documentation on feeding inputs to a tf.estimator model that was converted with a serving_input_fn.
#include <cstdio>
#include <memory>
#include <vector>
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"
#include "tensorflow/lite/optional_debug_tools.h"

int main()
{
    // Load model
    std::unique_ptr<tflite::FlatBufferModel> model =
        tflite::FlatBufferModel::BuildFromFile("model.tflite");

    // Build the interpreter with the InterpreterBuilder.
    tflite::ops::builtin::BuiltinOpResolver resolver;
    tflite::InterpreterBuilder builder(*model, resolver);
    std::unique_ptr<tflite::Interpreter> interpreter;
    builder(&interpreter);
    tflite::PrintInterpreterState(interpreter.get());

    // Allocate tensor buffers.
    interpreter->AllocateTensors();
    printf("=== Pre-invoke Interpreter State ===\n");
    tflite::PrintInterpreterState(interpreter.get());

    // Fill input buffers
    std::vector<float> tensor(3, 0); // Vector of zeros
    int input = interpreter->inputs()[0];
    float* input_data_ptr = interpreter->typed_input_tensor<float>(input);
    for (int i = 0; i < 3; ++i)
    {
        *(input_data_ptr) = (float)tensor[i];
        input_data_ptr++;
    }

    // Run inference
    interpreter->Invoke();
    printf("\n\n=== Post-invoke Interpreter State ===\n");
    tflite::PrintInterpreterState(interpreter.get());

    return 0;
}
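Since the input tensor is reported as kTfLiteString / kTfLiteDynamic, I suspect that writing floats through typed_input_tensor<float>() is what crashes. From tensorflow/lite/string_util.h it looks like string tensors are meant to be filled with tflite::DynamicBuffer instead. This is only a sketch of how I think the fill step should look; I don't yet know what bytes the model expects in the string:

#include <string>
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/string_util.h"

// Write one raw string element into the single kTfLiteString input tensor.
// The payload is a placeholder here; I don't know yet what it has to contain.
void FillStringInput(tflite::Interpreter* interpreter, const std::string& payload)
{
    int input = interpreter->inputs()[0];
    tflite::DynamicBuffer buffer;
    buffer.AddString(payload.data(), payload.size());
    // Passing nullptr keeps the tensor's existing shape ([1] in this model).
    buffer.WriteToTensor(interpreter->tensor(input), /*new_shape=*/nullptr);
}

I would call this after AllocateTensors() and before Invoke(), in place of the float-filling loop above.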
EDIT 1:
I also asked this question on TensorFlow's GitHub and got a comment saying that I have to feed my inputs in the form of an "example proto". Now the problem is reduced to: what is an "example proto", and how can one feed inputs to a TFLite model in the form of an example proto?
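As far as I can tell, an "example proto" is a serialized tf.train.Example message (tensorflow::Example in C++, defined in tensorflow/core/example/example.proto), which is what the parsing serving_input_receiver_fn parses on the other side. Below is a sketch of how I think one could be built for my three float features and serialized on the C++ side, assuming the example.pb.h header and its proto library are available in my Bazel build; I have not verified this end to end:

#include <string>
#include "tensorflow/core/example/example.pb.h"

// Build a tf.Example carrying the three float features the estimator was
// trained on and serialize it into the string the parsing serving_input_fn
// presumably expects. The feature names match my TF2 prediction dictionary.
std::string MakeSerializedExample(float f0, float f1, float f2)
{
    tensorflow::Example example;
    auto& feature_map = *example.mutable_features()->mutable_feature();
    feature_map["feature0"].mutable_float_list()->add_value(f0);
    feature_map["feature1"].mutable_float_list()->add_value(f1);
    feature_map["feature2"].mutable_float_list()->add_value(f2);
    return example.SerializeAsString();
}

If this is right, the returned string would be the payload for the DynamicBuffer fill sketched above, but I cannot confirm it, so corrections are welcome.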