Now that we have used TensorFlow to train and export a model, we can implement the inference service with this model, just like tensorflow/serving does.
One open question is whether the tf.Session object is thread-safe. If it is, we can initialize the session once after the server starts and use that singleton object to process concurrent requests.
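
As a minimal sketch of that singleton approach, the snippet below loads an exported SavedModel into a single tf.Session at startup and shares it across request-handling threads, assuming Session.run() can safely be called concurrently. The export directory ("export_dir") and the tensor names ("input:0", "output:0") are placeholders for whatever the exported model actually uses.

```python
import threading
import tensorflow as tf

# Create the singleton session once at startup and load the exported
# model into its graph. "export_dir" is a placeholder path; "serve" is
# the standard SavedModel serving tag.
graph = tf.Graph()
sess = tf.Session(graph=graph)
with graph.as_default():
    tf.saved_model.loader.load(sess, ["serve"], "export_dir")

def handle_request(request_data):
    # Every concurrent request reuses the same session. Session.run()
    # only executes the graph and does not mutate it, which is why a
    # shared session is plausible here; "input:0" and "output:0" are
    # assumed tensor names from the exported graph.
    return sess.run("output:0", feed_dict={"input:0": request_data})

# Simulate several concurrent requests hitting the shared session.
threads = [
    threading.Thread(target=lambda: print(handle_request([[1.0, 2.0]])))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

If Session.run() turned out not to be thread-safe, the fallback would be a session pool or a lock around the run call, at the cost of request throughput.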