
I trained my model using tf.keras. I converted this model to '.pb' as follows:

import os
import tensorflow as tf
from tensorflow.keras import backend as K
K.set_learning_phase(0)

from tensorflow.keras.models import load_model
model = load_model('model_checkpoint.h5')
model.save('model_tf2', save_format='tf')

This creates a folder 'model_tf2' with 'assets', variables, and saved_model.pb

I'm trying to load this model in C++. Referring to many other posts (mainly, Using Tensorflow checkpoint to restore model in C++), I am now able to load the model:

    RunOptions run_options;
    run_options.set_timeout_in_ms(60000);
    SavedModelBundle model;
    auto status = LoadSavedModel(SessionOptions(), run_options, model_dir_path, tags, &model);
    if (!status.ok()) {
        std::cerr << "Failed: " << status;
        return -1;
    }

[Screenshot: cmd output showing the model was loaded]

The above screenshot shows that the model was loaded.

I have the following questions:

  1. How do I do a forward pass through the model?
  2. I understand 'tag' can be gpu, serve, train, etc. What is the difference between serve and gpu?
  3. I don't understand the first two arguments to LoadSavedModel, i.e. the session options and run options. What purpose do they serve? Could you also help me understand with a syntactical example? I set run_options by looking at another Stack Overflow post, but I don't understand its purpose.

Thank you!! :)

  • As this question appeared several times: https://github.com/PatWie/tensorflow-cmake/blob/master/examples/keras/inference.cpp – Patwie Nov 24 '19 at 10:45
  • @ashwin Did you succeed in doing a forward pass? Care to post your code in a new answer? – WurmD Apr 21 '20 at 16:02

2 Answers


Code to perform a forward pass through the model, as mentioned by Patwie in the comments, is given below:

#include <tensorflow/core/protobuf/meta_graph.pb.h>
#include <tensorflow/core/public/session.h>
#include <tensorflow/core/public/session_options.h>
#include <iostream>
#include <string>

typedef std::vector<std::pair<std::string, tensorflow::Tensor>> tensor_dict;

/**
 * @brief load a previously stored model
 * @details [long description]
 *
 * in Python run:
 *
 *    saver = tf.train.Saver(tf.global_variables())
 *    saver.save(sess, './exported/my_model')
 *    tf.train.write_graph(sess.graph, '.', './exported/graph.pb', as_text=False)
 *
 * this relies on a graph which has an operation called `init` responsible to
 * initialize all variables, eg.
 *
 *    sess.run(tf.global_variables_initializer())  # somewhere in the python
 * file
 *
 * @param sess active tensorflow session
 * @param graph_fn path to graph file (eg. "./exported/graph.pb")
 * @param checkpoint_fn path to checkpoint file (eg. "./exported/my_model",
 * optional)
 * @return status of reloading
 */
tensorflow::Status LoadModel(tensorflow::Session *sess, std::string graph_fn,
                             std::string checkpoint_fn = "") {
  tensorflow::Status status;

  // Read in the protobuf graph we exported
  tensorflow::MetaGraphDef graph_def;
  status = ReadBinaryProto(tensorflow::Env::Default(), graph_fn, &graph_def);
  if (status != tensorflow::Status::OK()) return status;

  // create the graph
  status = sess->Create(graph_def.graph_def());
  if (status != tensorflow::Status::OK()) return status;

  // restore model from checkpoint, iff checkpoint is given
  if (checkpoint_fn != "") {
    tensorflow::Tensor checkpointPathTensor(tensorflow::DT_STRING,
                                            tensorflow::TensorShape());
    checkpointPathTensor.scalar<std::string>()() = checkpoint_fn;

    tensor_dict feed_dict = {
        {graph_def.saver_def().filename_tensor_name(), checkpointPathTensor}};
    status = sess->Run(feed_dict, {}, {graph_def.saver_def().restore_op_name()},
                       nullptr);
    if (status != tensorflow::Status::OK()) return status;
  } else {
    // virtual Status Run(const std::vector<std::pair<string, Tensor> >& inputs,
    //                  const std::vector<string>& output_tensor_names,
    //                  const std::vector<string>& target_node_names,
    //                  std::vector<Tensor>* outputs) = 0;
    status = sess->Run({}, {}, {"init"}, nullptr);
    if (status != tensorflow::Status::OK()) return status;
  }

  return tensorflow::Status::OK();
}

int main(int argc, char const *argv[]) {
  const std::string graph_fn = "./exported/my_model.meta";
  const std::string checkpoint_fn = "./exported/my_model";

  // prepare session
  tensorflow::Session *sess;
  tensorflow::SessionOptions options;
  TF_CHECK_OK(tensorflow::NewSession(options, &sess));
  TF_CHECK_OK(LoadModel(sess, graph_fn, checkpoint_fn));

  // prepare inputs
  tensorflow::TensorShape data_shape({1, 2});
  tensorflow::Tensor data(tensorflow::DT_FLOAT, data_shape);

  // same as in python file
  auto data_ = data.flat<float>().data();
  data_[0] = 42;
  data_[1] = 43;

  tensor_dict feed_dict = {
      {"input_plhdr", data},
  };

  std::vector<tensorflow::Tensor> outputs;
  TF_CHECK_OK(
      sess->Run(feed_dict, {"sequential/Output_1/Softmax:0"}, {}, &outputs));

  std::cout << "input           " << data.DebugString() << std::endl;
  std::cout << "output          " << outputs[0].DebugString() << std::endl;

  return 0;
}
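
If you are not sure which tensor names to feed and fetch (e.g. "input_plhdr" or "sequential/Output_1/Softmax:0" above), a small fragment like the following, placed inside LoadModel where graph_def is in scope, prints every node name in the loaded graph so you can pick the right ones:

  // Sketch: list all node names in the graph to locate input/output tensors.
  for (const auto& node : graph_def.graph_def().node()) {
    std::cout << node.name() << std::endl;
  }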
  1. The tags serve and gpu can be used together if we want to perform inference on a model using the GPU (see the sketch after this list).

  2. The argument session_options in C++ is equivalent to tf.ConfigProto(allow_soft_placement=True, log_device_placement=True), which means that, if allow_soft_placement is true, an op will be placed on the CPU if

     (i) there is no GPU implementation for the op, or

     (ii) no GPU devices are known or registered, or

     (iii) it needs to be co-located with reftype input(s) which are from the CPU.

  3. The argument run_options is used if we want to use the profiler, i.e., to extract runtime statistics of the graph execution. It adds information about execution time and memory consumption to your event files and allows you to see this information in TensorBoard.

  4. The syntax for using session_options and run_options is given in the code above and in the sketch below.
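
For example, here is a minimal sketch (untested against your exported model) that ties these points together using LoadSavedModel: session_options carries the ConfigProto, run_options enables tracing for the profiler, the tag set selects which MetaGraphDef to load, and the forward pass runs through bundle.session. The input and output tensor names below ("serving_default_input_1:0" and "StatefulPartitionedCall:0") are placeholders; the real names must be looked up in the SavedModel's serving_default signature (e.g. from bundle.meta_graph_def.signature_def()).

#include <tensorflow/cc/saved_model/loader.h>
#include <tensorflow/cc/saved_model/tag_constants.h>
#include <tensorflow/core/framework/tensor.h>
#include <iostream>
#include <string>
#include <vector>

int main() {
  // session_options: C++ counterpart of
  // tf.ConfigProto(allow_soft_placement=True, log_device_placement=True).
  tensorflow::SessionOptions session_options;
  session_options.config.set_allow_soft_placement(true);
  session_options.config.set_log_device_placement(true);

  // run_options: only relevant if you want runtime statistics;
  // FULL_TRACE records per-op timing and memory usage.
  tensorflow::RunOptions run_options;
  run_options.set_trace_level(tensorflow::RunOptions::FULL_TRACE);
  run_options.set_timeout_in_ms(60000);

  // The tag set selects which MetaGraphDef to load; "serve" is the usual
  // choice, and "gpu" can be added if a GPU-tagged graph was exported.
  const std::string model_dir = "model_tf2";
  tensorflow::SavedModelBundle bundle;
  TF_CHECK_OK(tensorflow::LoadSavedModel(session_options, run_options, model_dir,
                                         {tensorflow::kSavedModelTagServe}, &bundle));

  // Forward pass: feed the input tensor by name, fetch the output by name.
  // These tensor names are placeholders; read the real ones from the
  // "serving_default" signature in bundle.meta_graph_def.signature_def().
  tensorflow::Tensor input(tensorflow::DT_FLOAT, tensorflow::TensorShape({1, 2}));
  input.flat<float>()(0) = 42.0f;
  input.flat<float>()(1) = 43.0f;

  std::vector<tensorflow::Tensor> outputs;
  TF_CHECK_OK(bundle.session->Run({{"serving_default_input_1:0", input}},
                                  {"StatefulPartitionedCall:0"}, {}, &outputs));

  std::cout << "output: " << outputs[0].DebugString() << std::endl;
  return 0;
}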

  • Tensorflow Support, what version of Tensorflow was used in this answer? I cannot find a tensorflow::Session in the 2.1 C++ documentation https://www.tensorflow.org/api_docs/cc/ – WurmD Apr 22 '20 at 13:18
  • 1
    WurmD, it is Tensorflow Version 1.x. –  Apr 22 '20 at 14:53
  • The question involves using LoadSavedModel and the answer is based on using ReadBinaryProto which are not equivalent and serve different purposes. – fisakhan Sep 11 '20 at 13:10

This worked well with TF1.5

Load-graph function:

#include <tensorflow/core/public/session.h>
#include <tensorflow/core/platform/env.h>
#include <tensorflow/core/lib/core/errors.h>
#include <memory>

using tensorflow::Status;

// Reads a frozen GraphDef from disk and creates a session for it.
Status LoadGraph(const tensorflow::string& graph_file_name,
    std::unique_ptr<tensorflow::Session>* session, tensorflow::SessionOptions options) {
    tensorflow::GraphDef graph_def;
    Status load_graph_status =
        ReadBinaryProto(tensorflow::Env::Default(), graph_file_name, &graph_def);
    if (!load_graph_status.ok()) {
        return tensorflow::errors::NotFound("Failed to load compute graph at '",
            graph_file_name, "'");
    }
    //session->reset(tensorflow::NewSession(tensorflow::SessionOptions()));
    session->reset(tensorflow::NewSession(options));
    Status session_create_status = (*session)->Create(graph_def);
    if (!session_create_status.ok()) {
        return session_create_status;
    }
    return Status::OK();
}

Call the load-graph function with the path to the .pb model and the desired session configuration. Once the model is loaded, you can do a forward pass by calling Run:

Status load_graph_status = LoadGraph(graph_path, &session_fpass, options);

if (!load_graph_status.ok()) {
    LOG(ERROR) << load_graph_status;
    return -1;
}


std::vector<tensorflow::Tensor> outputs;

Status run_status = session_fpass->Run({ {input_layer, image_in} },
    { output_layer1}, { output_layer1}, &outputs);

if (!run_status.ok()) {
    LOG(ERROR) << "Running model failed: " << run_status;
    return -1;
}
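
To read the actual values out of the result, something like the following should work (a sketch, assuming the first output is a float tensor such as softmax probabilities):

// Sketch: outputs[0] is assumed to be a float tensor of shape {1, num_classes}.
auto probs = outputs[0].flat<float>();
for (int i = 0; i < probs.size(); ++i) {
    std::cout << "class " << i << " score: " << probs(i) << std::endl;
}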
  • The question involves using LoadSavedModel and the answer is based on using ReadBinaryProto which are not equivalent and serve different purposes. – fisakhan Sep 11 '20 at 13:09
    I did not realize while posting this question that the model could be loaded in C++ either for "serving" or for "stand-alone" use. The question was written with the "stand-alone" approach in mind, where I load the model every time I do a forward pass. For that approach this solution works. As I understand it, the SavedModel version of the model could be used for serving. – Ashwin Kannan Sep 12 '20 at 14:16