I have followed the TensorFlow 2 documentation to convert my trained tf.estimator model to a TFLite model. To convert the model, I first had to export it in the saved_model format with an input_receiver_fn, and then convert it with the SELECT_TF_OPS flag:
import tensorflow as tf

# Train the estimator on the three float features
classifier = tf.estimator.LinearClassifier(n_classes=2, model_dir=classifier_dir, feature_columns=features)
classifier.train(input_fn=lambda: train_fn(features=train_datas, labels=train_labels))

# Export a SavedModel with a parsing serving_input_receiver_fn
serving_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(
    tf.feature_column.make_parse_example_spec(features))
classifier.export_saved_model(classifier_dir + "/saved_model", serving_input_fn)

# Convert to TFLite with select TF ops enabled
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir=saved_model_dir, signature_keys=['serving_default'])
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
tflite_model = converter.convert()

with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
I want to run my TFLite model on an ARM device without Python support, so I built the C++ interpreter shared libraries with Bazel as explained in the documentation:
Cross-compile for armhf with Bazel
Select TensorFlow operators C++
My model has 3 input features, but when I try to follow the inference guide I get a segmentation fault. I used the following Python code to inspect the model's inputs and outputs:
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="./model.tflite")
interpreter.allocate_tensors()
print("all ok")

# Print input shape and type
inputs = interpreter.get_input_details()
print('{} input(s):'.format(len(inputs)))
for i in range(0, len(inputs)):
    print('{} {}'.format(inputs[i]['shape'], inputs[i]['dtype']))

# Print output shape and type
outputs = interpreter.get_output_details()
print('\n{} output(s):'.format(len(outputs)))
for i in range(0, len(outputs)):
    print('{} {}'.format(outputs[i]['shape'], outputs[i]['dtype']))
I got the following output:
all ok
1 input(s):
[1] <class 'numpy.bytes_'>
2 output(s):
[1 2] <class 'numpy.bytes_'>
[1 2] <class 'numpy.float32'>
The first few lines of the output of tflite::PrintInterpreterState(interpreter.get()) are:
INFO: Created TensorFlow Lite delegate for select TF ops.
INFO: TfLiteFlexDelegate delegate: 1 nodes delegated out of 25 nodes with 1 partitions.
Interpreter has 54 tensors and 26 nodes
Inputs: 0
Outputs: 38 34
Tensor 0 input_example_tensor kTfLiteString kTfLiteDynamic 0 bytes ( 0.0 MB) 1
The output shows that the input shape is not the same as in the original model, and the input type is <class 'numpy.bytes_'>, while the TensorFlow 2 model's inputs are [numpy.float32, numpy.float32, numpy.float32]. My input dictionary for prediction with the TF2 model looks like: {'feature0' : data0, 'feature1' : data1, 'feature2' : data2}
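To double-check this from the C++ side before filling any buffers, I added a small diagnostic helper to the program shown further below. It queries the first input tensor directly through the interpreter; the field names (name, type, dims) come from the TfLiteTensor struct in the TFLite headers, and this is only a sketch:

#include <cstdio>
#include "tensorflow/lite/interpreter.h"

// Print name, type and shape of the model's first input tensor, to confirm
// from C++ that it really is the single kTfLiteString input reported above.
void DescribeFirstInput(const tflite::Interpreter& interpreter)
{
    const TfLiteTensor* t = interpreter.tensor(interpreter.inputs()[0]);
    printf("name=%s type=%d (kTfLiteString=%d)\n", t->name, t->type, kTfLiteString);
    printf("dims:");
    for (int i = 0; i < t->dims->size; ++i)
        printf(" %d", t->dims->data[i]);
    printf("\n");
}

This only confirms what PrintInterpreterState already reports, but it is easier to branch on in code.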
Here is the Google Colab link to the TensorFlow model. I had no previous experience with running inference on TensorFlow Lite models, so I searched first and found these related questions, which helped me write the C++ code below:
TensorFlow Lite C++ API example for inference
How to give multi-dimensional inputs to tflite via C++ API
I tried to fill the input buffer with a vector of zeros, but without success. Here is my C++ code to load the TFLite model and feed it inputs for prediction. Can someone please point me in the right direction? I could not find any examples or documentation on feeding inputs to a tf.estimator model that was converted with a serving_input_fn.
#include <cstdio>
#include <memory>
#include <vector>
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"
#include "tensorflow/lite/optional_debug_tools.h"

int main()
{
    // Load model
    std::unique_ptr<tflite::FlatBufferModel> model =
        tflite::FlatBufferModel::BuildFromFile("model.tflite");

    // Build the interpreter with the InterpreterBuilder.
    tflite::ops::builtin::BuiltinOpResolver resolver;
    tflite::InterpreterBuilder builder(*model, resolver);
    std::unique_ptr<tflite::Interpreter> interpreter;
    builder(&interpreter);
    tflite::PrintInterpreterState(interpreter.get());

    // Allocate tensor buffers.
    interpreter->AllocateTensors();
    printf("=== Pre-invoke Interpreter State ===\n");
    tflite::PrintInterpreterState(interpreter.get());

    // Fill input buffers
    std::vector<float> tensor(3, 0); // Vector of zeros
    int input = interpreter->inputs()[0];
    float* input_data_ptr = interpreter->typed_input_tensor<float>(input);
    for (int i = 0; i < 3; ++i)
    {
        *(input_data_ptr) = (float)tensor[i];
        input_data_ptr++;
    }

    // Run inference
    interpreter->Invoke();
    printf("\n\n=== Post-invoke Interpreter State ===\n");
    tflite::PrintInterpreterState(interpreter.get());

    return 0;
}
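Since the input tensor is reported as kTfLiteString / kTfLiteDynamic, I suspect that writing floats through typed_input_tensor<float>() is what crashes. From tensorflow/lite/string_util.h it looks like string tensors are meant to be filled with tflite::DynamicBuffer instead. This is only a sketch of how I think the fill step should look; I don't yet know what bytes the model expects in the string:

#include <string>
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/string_util.h"

// Write one raw string element into the single kTfLiteString input tensor.
// The payload is a placeholder here; I don't know yet what it has to contain.
void FillStringInput(tflite::Interpreter* interpreter, const std::string& payload)
{
    int input = interpreter->inputs()[0];
    tflite::DynamicBuffer buffer;
    buffer.AddString(payload.data(), payload.size());
    // Passing nullptr keeps the tensor's existing shape ([1] in this model).
    buffer.WriteToTensor(interpreter->tensor(input), /*new_shape=*/nullptr);
}

I would call this after AllocateTensors() and before Invoke(), in place of the float-filling loop above.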
EDIT 1:
I also asked this question on TensorFlow's GitHub and got a comment saying that I have to feed my inputs in the form of an "example proto". Now the problem is reduced to: what is an "example proto", and how can one feed inputs to a TFLite model in the form of an example proto?
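As far as I can tell, an "example proto" is a serialized tf.train.Example message (tensorflow::Example in C++, defined in tensorflow/core/example/example.proto), which is what the parsing serving_input_receiver_fn parses on the other side. Below is a sketch of how I think one could be built for my three float features and serialized on the C++ side, assuming the example.pb.h header and its proto library are available in my Bazel build; I have not verified this end to end:

#include <string>
#include "tensorflow/core/example/example.pb.h"

// Build a tf.Example carrying the three float features the estimator was
// trained on and serialize it into the string the parsing serving_input_fn
// presumably expects. The feature names match my TF2 prediction dictionary.
std::string MakeSerializedExample(float f0, float f1, float f2)
{
    tensorflow::Example example;
    auto& feature_map = *example.mutable_features()->mutable_feature();
    feature_map["feature0"].mutable_float_list()->add_value(f0);
    feature_map["feature1"].mutable_float_list()->add_value(f1);
    feature_map["feature2"].mutable_float_list()->add_value(f2);
    return example.SerializeAsString();
}

If this is right, the returned string would be the payload for the DynamicBuffer fill sketched above, but I cannot confirm it, so corrections are welcome.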