
As I understand it, tf.reset_default_graph() only creates a new graph and sets it as the default graph. So the previously created tensors would just be lying around occupying memory. I have also read that unreferenced tensors are not garbage collected (as normal Python variables are).

If I am running cross-validation to search over a set of hyperparameters, and thus creating the same graph again and again, how do I get rid of the previously created tensors?

nbro
figs_and_nuts
  • Did you ever find an answer? – SantoshGupta7 Jun 22 '20 at 23:03
  • This is related to, or a duplicate of, [Tensorflow delete graph and free up resources](https://stackoverflow.com/q/58435961/1782792). @SantoshGupta7 There is a bit of a misconception in the question: in a setup like cross-validation, the graph and tensors shouldn't usually take a lot of space, but the session (where variable values are stored and resources are pooled for training) might. Graphs become big when they are "frozen" (variables converted to constants) and/or when they have a _very_ large number of operations. In any case, if you don't keep references to objects, they should be garbage collected. – jdehesa Jun 23 '20 at 09:37
  • @jdehesa I don't think unreferenced tensors were garbage collected in TensorFlow 1. – figs_and_nuts Jun 30 '20 at 06:32

1 Answer


I had the same problem when designing experiments. After researching it, the only solution that worked for me is this one. As you can read in that link, it seems to be a design flaw, and the TF team doesn't seem to care about fixing it.

The solution is to create a new process for each cross-validation iteration. When the process finishes, the operating system reclaims all of its resources automatically.

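import multiprocessing

def evaluate(...):
    # Import TensorFlow inside the child process so that its state
    # (graphs, sessions, device memory) lives and dies with that process.
    import tensorflow as tf
    # Your logic

for ... in cross_validation_loop:
    # Run each iteration in its own process and wait for it to finish.
    process_eval = multiprocessing.Process(target=evaluate, args=(...))
    process_eval.start()
    process_eval.join()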
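If you also need the score of each iteration back in the parent process, something along these lines might work. This is only a minimal sketch, not tested against TensorFlow itself: `run_isolated`, `_worker` and `evaluate_fold` are hypothetical names I made up for illustration, and `evaluate_fold` just returns a placeholder where the real training code would go.

```python
import multiprocessing


def _worker(conn, fn, args):
    """Run fn(*args) in the child process and send the result to the parent."""
    try:
        conn.send(fn(*args))
    finally:
        conn.close()


def run_isolated(fn, args=(), start_method=None):
    """Run fn(*args) in a fresh process and return its result.

    start_method=None uses the platform default start method.
    """
    ctx = multiprocessing.get_context(start_method)
    # One-way pipe: the first end only receives, the second only sends.
    parent_conn, child_conn = ctx.Pipe(duplex=False)
    process = ctx.Process(target=_worker, args=(child_conn, fn, args))
    process.start()
    child_conn.close()  # the parent keeps only the receiving end
    try:
        # Raises EOFError if the child exits without sending a result.
        result = parent_conn.recv()
    finally:
        process.join()  # the child's memory is released once it has exited
        parent_conn.close()
    return result


def evaluate_fold(learning_rate, fold):
    """Hypothetical stand-in for one cross-validation iteration."""
    # import tensorflow as tf  # import here, not in the parent process
    # ... build the graph, train, evaluate ...
    return {"learning_rate": learning_rate, "fold": fold}  # placeholder result
```

You would then call `run_isolated(evaluate_fold, (learning_rate, fold))` inside the cross-validation loop. On platforms where the default start method is "spawn" (Windows, macOS), the loop has to sit under an `if __name__ == "__main__":` guard, and `evaluate_fold` must be defined at module level so the child process can import it.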
Pedrolarben