
I am trying to restrict GPU memory allocation in a MonitoredTrainingSession.

Setting tf.GPUOptions as described in How to prevent tensorflow from allocating the totality of a GPU memory? does not work in the case of a MonitoredTrainingSession.

I tried:

import tensorflow as tf

# Limit the fraction of GPU memory this process may allocate
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=.1)
# (alternatively: tf.GPUOptions(allow_growth=True))

# `filters` is a list of device filters defined elsewhere in the script
config = tf.ConfigProto(allow_soft_placement=False,
                        device_filters=filters,
                        gpu_options=gpu_options)

scaffold = tf.train.Scaffold(
    saver=tf.train.Saver(max_to_keep=100, keep_checkpoint_every_n_hours=.5))

# `server.target` and `log_dir` are also defined elsewhere in the script
with tf.train.MonitoredTrainingSession(
                server.target,
                is_chief=True,
                checkpoint_dir=log_dir,
                scaffold=scaffold,
                save_checkpoint_secs=600,
                save_summaries_secs=30,
                log_step_count_steps=int(1e7),
                config=config) as session:
    ...  # training loop

Despite using tf.GPUOptions, memory consumption is 10189MiB / 11175MiB.

1 Answer


I figured out what the problem was: the first session that is opened in the process needs to include the memory options.

Hence, if in doubt, open a dummy session with the memory limit right at the beginning of the script.
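A minimal sketch of that workaround, assuming the TF 1.x API used in the question (the memory fraction of 0.1 is just an example value):

import tensorflow as tf

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=.1)
config = tf.ConfigProto(gpu_options=gpu_options)

# Open (and immediately close) a throwaway session first, so the process
# initializes the GPU with the memory limit before any other session runs.
with tf.Session(config=config):
    pass

# ... the MonitoredTrainingSession created later in the script then runs
# within the already-capped GPU allocation.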