I am running a model in production. Unfortunately, I only have access to CPUs, and the model is a TensorFlow 1 model.
I am loading the model in TF2 through tf.compat.v1:
import tensorflow as tf

# Run the TF1 graph-mode model under TF2.
tf.compat.v1.disable_v2_behavior()
session = tf.compat.v1.Session()
tf.compat.v1.saved_model.loader.load(session, [tf.compat.v1.saved_model.tag_constants.SERVING], './model')
Unfortunately, every time data comes in and I run the session (self.session.run([out], feed_dict={input_data: batch})), the memory usage increases. After the image has been processed, the memory lingers and is not released.
I searched, but only found options for GPUs. Is there a way to limit the CPU memory or to solve this problem in a different manner?
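
To make the growth described above measurable, here is a minimal sketch that records the process's peak resident memory after each inference call. It assumes a Unix host (the standard-library resource module is not available on Windows). The names peak_rss_mib, track_memory, and fake_inference are hypothetical and used only for illustration; fake_inference stands in for the real session.run call. Because peak RSS never decreases, this can only show whether memory plateaus after a few calls or keeps climbing with every call; it cannot show memory being released.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def track_memory(run_inference, batches):
    """Call run_inference on each batch and record peak RSS after each call."""
    readings = []
    for batch in batches:
        run_inference(batch)
        readings.append(peak_rss_mib())
    return readings


if __name__ == "__main__":
    # Hypothetical stand-in for:
    #   session.run([out], feed_dict={input_data: batch})
    def fake_inference(batch):
        return [x * 2 for x in batch]

    readings = track_memory(fake_inference, [[1, 2, 3]] * 5)
    for i, mib in enumerate(readings):
        print(f"call {i}: peak RSS {mib:.1f} MiB")
```

In the real service, run_inference would wrap the session.run call, and batches would be a list of representative inputs of the same shape.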