
I would like to convert a TRT-optimized frozen model to a SavedModel for TensorFlow Serving. Are there any suggestions or sources to share?

Or are there any other ways to deploy a TRT-optimized model in TensorFlow Serving?

Thanks.

zzachimonde
  • Hi, please take the time to read https://stackoverflow.com/help/how-to-ask – core114 Sep 25 '18 at 03:33
  • This answer might help if you are looking for an implementation: https://stackoverflow.com/a/44329200/7977464 – isydmr Dec 20 '19 at 12:48

1 Answer


Assuming you have a TRT-optimized model (i.e., the model is already represented in UFF), you can simply follow the steps outlined here: https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#python_topics. Pay special attention to sections 3.3 and 3.4, since those are where you actually build the TRT engine and then save it to a file for later use. From that point forward, you can just re-use the serialized engine (a.k.a. a PLAN file) to do inference.

Basically, the workflow looks something like this (a rough sketch of steps 3–5 follows the list):

  1. Build/train model in TensorFlow.
  2. Freeze model (you get a protobuf representation).
  3. Convert model to UFF so TensorRT can understand it.
  4. Use the UFF representation to build a TensorRT engine.
  5. Serialize the engine and save it to a PLAN file.
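For illustration, here is a minimal sketch of steps 3–5 with the TensorRT Python API. It assumes TensorRT 5.x with the uff converter installed; the file names (frozen_model.pb, model.plan), the node names (input_node, output_node), and the input shape are placeholders, not values from any particular model.

    import uff
    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

    # Step 3: convert the frozen protobuf to a UFF buffer.
    # "frozen_model.pb" and "output_node" are placeholders for your own graph.
    uff_buffer = uff.from_tensorflow_frozen_model(
        "frozen_model.pb", output_nodes=["output_node"])

    # Step 4: parse the UFF model and build a TensorRT engine.
    with trt.Builder(TRT_LOGGER) as builder, \
         builder.create_network() as network, \
         trt.UffParser() as parser:
        builder.max_batch_size = 1
        builder.max_workspace_size = 1 << 30  # 1 GiB of builder scratch space
        parser.register_input("input_node", (3, 224, 224))  # placeholder name/shape (CHW)
        parser.register_output("output_node")
        parser.parse_buffer(uff_buffer, network)
        engine = builder.build_cuda_engine(network)

    # Step 5: serialize the engine to a PLAN file for later reuse.
    with open("model.plan", "wb") as f:
        f.write(engine.serialize())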

Once those steps are done (and you should have sufficient example code in the link I provided) you can just load the PLAN file and re-use it over and over again for inference operations.
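A correspondingly minimal sketch of loading the engine back (again assuming TensorRT 5.x and the placeholder file name model.plan from above). The actual inference call additionally needs device buffers and host/device copies (e.g. via pycuda), which the MNIST sample mentioned below demonstrates.

    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

    # Deserialize the PLAN file into an engine you can reuse across requests.
    with open("model.plan", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())

    # The execution context is what you call execute()/execute_async() on,
    # passing device-buffer bindings for each input and output tensor.
    context = engine.create_execution_context()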

If you are still stuck, there is an excellent example that is installed by default here: /usr/src/tensorrt/samples/python/end_to_end_tensorflow_mnist. You should be able to use that example to see how to get to the UFF format. Then you can just combine that with the example code found in the link I provided.

It'sPete
  • Thanks for your reply, but it seems you have misunderstood what I need. TensorFlow Serving is a platform for running TensorFlow models, based on C++, and it only supports the SavedModel format. A TRT-optimized model, however, is in frozen-graph format. I just want to know how to convert the TRT model to a SavedModel so that it can then be loaded in TensorFlow Serving. – zzachimonde Oct 15 '18 at 09:51