
I am using a TensorFlow DNNClassifier running on a CPU. I have finished training and am now calling estimator.predict repeatedly; after a few thousand calls I get the error below. I'm confused because I assumed that making a prediction would not in itself increase memory (I saw some other people posting a similar error, but they were using GPUs and seeing it during training).

....
File "C:\Users\Zvi\AppData\Local\Programs\Python\Python36\lib\site-packages\tensorflow\python\framework\ops.py", line 1654, in __init__
self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[973771,128] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu
 [[Node: save/AssignVariableOp = AssignVariableOp[dtype=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](dnn/input_from_feature_columns/input_layer/product_hub_module_embedding/module/embeddings/part_0, save/Identity_7)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

1 Answer


I'm confused because I assumed that making a prediction would not in itself increase memory.

That is actually not true, and it is the likely reason for the OOM. Estimator.predict() rebuilds the graph from scratch on each call and loads the weights from disk for inference. See this question and this issue on GitHub for more details. Yes, the graph, tensors, and other objects become available for GC after the call, but that doesn't mean all of them are collected immediately.
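For illustration, here is a minimal TF 1.x sketch of the kind of loop that runs into this. The feature spec, model layout, model_dir and data below are hypothetical, not taken from the question; the point is only that every pass through the loop makes its own predict() call, so the graph is rebuilt and the checkpoint reloaded on each iteration.

import numpy as np
import tensorflow as tf  # TF 1.x Estimator API

# Hypothetical setup; assumes a trained checkpoint already exists in model_dir
# (the question states training has finished).
feature_columns = [tf.feature_column.numeric_column("x", shape=[4])]
estimator = tf.estimator.DNNClassifier(
    hidden_units=[128, 64],
    feature_columns=feature_columns,
    n_classes=3,
    model_dir="/tmp/dnn_model")

examples = np.random.rand(5000, 4).astype(np.float32)  # placeholder data

for example in examples:
    # One example per predict() call: every iteration builds a fresh graph,
    # starts a new session, and restores the weights from disk.
    input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": example[np.newaxis, :]}, shuffle=False)
    prediction = next(estimator.predict(input_fn=input_fn))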

When this method is called a few thousand times, the stability of the whole application depends on how quickly previously allocated memory can be reclaimed. But Python's GC can be postponed for a long time, and even if it collects the garbage regularly, you may still run into memory fragmentation.

This means you should call Estimator.predict() fewer times with more input data per call (a sketch of this is below), or migrate from the Estimator API to Keras, TF-Slim, or a pure TensorFlow implementation.
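A minimal sketch of the batching alternative, reusing the hypothetical estimator and examples array from the sketch above: all pending examples go through a single predict() call, so the graph is built and the checkpoint restored only once.

# One input_fn over the whole array; the input pipeline handles batching
# internally, so a single predict() call serves every example.
batched_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": examples},
    batch_size=256,
    num_epochs=1,
    shuffle=False)

# predict() returns a lazy generator; draining it yields all 5000 predictions
# from one graph build and one checkpoint restore.
predictions = list(estimator.predict(input_fn=batched_input_fn))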
