GPU: NVIDIA RTX A2000 8GB

Keras 2.10.0

TensorFlow 2.10.0

cuDNN 8.1.0.77

As soon as I build the model (before any training), I find that GPU memory is almost fully allocated: 5754MiB / 8192MiB. This then causes a "Resource Exhausted: Out Of Memory (OOM)" error during training.
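To put a number on this from inside TensorFlow as well, something like the following should report the allocator's view of the usage (a minimal sketch; I'm assuming tf.config.experimental.get_memory_info is available in 2.10):

import tensorflow as tf

# TensorFlow's own view of GPU memory: bytes currently held by the
# allocator and the peak since the process started
info = tf.config.experimental.get_memory_info('GPU:0')
print(f"current: {info['current'] / 1024**2:.0f} MiB, "
      f"peak: {info['peak'] / 1024**2:.0f} MiB")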

The model is a pretty standard U-Net, and the training data are 1980 images of 256×256 pixels. The batch size is 16, but that shouldn't be relevant, since the GPU fills up as soon as I build the model, i.e. before training starts. The build looks roughly like the sketch below.
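(Simplified sketch; the depth and filter counts are illustrative rather than my exact network, and the dummy data / fit call at the end only mirror the shapes and batch size described above.)

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_unet(input_shape=(256, 256, 1), base_filters=64):
    inputs = layers.Input(shape=input_shape)

    # Encoder
    c1 = layers.Conv2D(base_filters, 3, activation='relu', padding='same')(inputs)
    c1 = layers.Conv2D(base_filters, 3, activation='relu', padding='same')(c1)
    p1 = layers.MaxPooling2D()(c1)

    c2 = layers.Conv2D(base_filters * 2, 3, activation='relu', padding='same')(p1)
    c2 = layers.Conv2D(base_filters * 2, 3, activation='relu', padding='same')(c2)
    p2 = layers.MaxPooling2D()(c2)

    # Bottleneck
    b = layers.Conv2D(base_filters * 4, 3, activation='relu', padding='same')(p2)

    # Decoder with skip connections
    u2 = layers.Conv2DTranspose(base_filters * 2, 2, strides=2, padding='same')(b)
    u2 = layers.concatenate([u2, c2])
    c3 = layers.Conv2D(base_filters * 2, 3, activation='relu', padding='same')(u2)

    u1 = layers.Conv2DTranspose(base_filters, 2, strides=2, padding='same')(c3)
    u1 = layers.concatenate([u1, c1])
    c4 = layers.Conv2D(base_filters, 3, activation='relu', padding='same')(u1)

    outputs = layers.Conv2D(1, 1, activation='sigmoid')(c4)
    return Model(inputs, outputs)

# Building the model is already enough to fill the GPU memory
model = build_unet()
model.compile(optimizer='adam', loss='binary_crossentropy')

# Dummy arrays with the same shape as my training set (1980 images, 256x256, 1 channel)
x_train = np.random.rand(1980, 256, 256, 1).astype('float32')
y_train = np.random.randint(0, 2, size=(1980, 256, 256, 1)).astype('float32')

# Training then fails with the OOM described above (epochs value is just a placeholder)
model.fit(x_train, y_train, batch_size=16, epochs=10)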

What could be the reason, and how can I prevent this?

The GPU stays filled until I restart the kernel or restart the PC.

P.S. I have already checked this; it doesn't help.

Before loading the data and building the model, I run the following:

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  try:
    # List the logical devices TF has created for the physical GPUs
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")

    # Ask TF to allocate GPU memory on demand instead of grabbing it all up front
    tf.config.experimental.set_memory_growth(gpus[0], True)
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)