I want to deploy a large model, e.g. BERT, on Spark to do inference since I don't have enough GPUs. Now I have two problems.

  1. I exported the model to .pb format and load it using the SavedModelBundle interface:

SavedModelBundle bundle = SavedModelBundle.load("E:\\pb\\1561992264", "serve");

However, I can't find a way to load the .pb model from an HDFS filesystem path.

  2. The Spark environment's glibc version isn't compatible with the TensorFlow version I trained the model with. Is there any way to work around this?

I am not sure this is a good way to serve a TensorFlow model on Spark. Any other suggestions are appreciated!


1 Answer


You could use Elephas (https://github.com/maxpumperla/elephas), which enables distributed training and inference of Keras models on Spark. Since you mentioned it's a TensorFlow model, this may require a conversion (detailed here: How can I convert a trained Tensorflow model to Keras?), but once it is a Keras model, it should be as simple as:

from elephas.spark_model import SparkModel


model = ... # load Keras model
data = ... # load in the data you want to perform inference on
spark_model = SparkModel(model)
predictions = spark_model.predict(data) # perform distributed inference on Spark cluster or local cores, depending on how Spark session is configured
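
For a more concrete picture, here is a minimal, self-contained sketch of the same pattern. It assumes the converted Keras model has already been saved as model.h5 and that the inputs are pre-tokenized features stored in features.npy; both file names, the app name, and the local[4] master are placeholders you would replace with your own setup:

# Minimal sketch: distributed inference with Elephas.
# Assumes the model was already converted to Keras and saved as "model.h5",
# and the inputs are pre-tokenized features in "features.npy" (placeholder names).
import numpy as np
from pyspark import SparkConf, SparkContext
from tensorflow.keras.models import load_model
from elephas.spark_model import SparkModel

conf = SparkConf().setAppName("bert-inference").setMaster("local[4]")  # or point at your cluster
sc = SparkContext.getOrCreate(conf)

model = load_model("model.h5")           # converted Keras model
data = np.load("features.npy")           # pre-tokenized input features

spark_model = SparkModel(model)
predictions = spark_model.predict(data)  # distributes the data across workers and collects predictions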
