
I am trying to run tensorflow-gpu version 2.4.0-dev20200828 (a tf-nightly build) for a convolutional neural network implementation. Some other details:

  • Python 3.8.5
  • Windows 10
  • NVIDIA RTX 2080 with 8 GB of VRAM
  • CUDA 11.1

The following snippet is what I run:

import numpy as np
import tensorflow as tf
from tensorflow import keras

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  try:
    # Limit the first GPU to a single 1 GB virtual device.
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)

# VGG16 without the classifier head, run on a single random 600x600 image.
vgg_16 = keras.applications.VGG16(include_top=False, input_shape=(600, 600, 3))
random_image = np.random.rand(1, 600, 600, 3)
output = vgg_16(random_image)

The code for the memory configuration was taken from answers here.

The issue is that my GPU has 8 GB of VRAM and I need to be able to run the CNN with relatively large image batch sizes. The example above runs on a single image, but surprisingly I can only push the batch size to about 2-3 600x600 images. According to its own comments, the configuration code is supposed to:

"Restrict TensorFlow to only allocate 1GB of memory on the first GPU"

which is clearly not ideal.

If I allocate more, say 4000 MB, I get errors such as:

E tensorflow/stream_executor/cuda/cuda_dnn.cc:325] Could not create cudnn handle: CUDNN_STATUS_ALLOC_FAILED

If I leave it at 1024 MB, I get messages like:

Allocator (GPU_0_bfc) ran out of memory trying to allocate 3.25GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.

Any insights/resources on how to understand this issue would be much appreciated. I'd be willing to switch to another version of tensorflow/python/CUDA if necessary, but ultimately I just want a deeper understanding of what this issue is.

IntegrateThis
  • You don't need to fiddle with anything in tensorflow to use all the RAM in your GPU, so why are you doing it? What is the real initial issue you were having? – Dr. Snoopy Sep 02 '20 at 01:25
  • @Dr.Snoopy well if I didn't add the memory fiddling code I get the warning message: failed to create cublas handle: CUBLAS_STATUS_ALLOC_FAILED – IntegrateThis Sep 02 '20 at 01:27
  • Have you considered that this network is just too big for this GPU? – Dr. Snoopy Sep 02 '20 at 01:29
  • @Dr.Snoopy Is 8GB Vram really not enough? I'm not super experienced in this matter, so I don't know. – IntegrateThis Sep 02 '20 at 01:30
  • Yeah it could be, VGG requires a lot of RAM for 224x224 images, and increasing to 600x600 will use at least 4-6 times more RAM. With smaller images it will probably work. – Dr. Snoopy Sep 02 '20 at 01:33
  • Also running development version of TensorFlow is not a good idea, use a stable version like 2.3 – Dr. Snoopy Sep 02 '20 at 01:40
  • @Dr.Snoopy today I tried setting the entirety of the weights in the VGG16 to not trainable and I am still limited to a batch size of 2 (600,600,3) images. I think something else is the problem here. – IntegrateThis Sep 03 '20 at 18:10

1 Answer


A better way to control memory usage is to enable memory growth. You should remove all of the above GPU configuration code and use this instead:

# Allocate GPU memory on demand rather than reserving a fixed block up front.
for gpu in tf.config.experimental.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
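
Note that, like the virtual device configuration in the question, memory growth has to be set before the GPUs are initialized. A minimal sketch of a more defensive version, following the same try/except pattern as the question's snippet:

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Must run before any operation has been placed on the GPU.
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)

With memory growth enabled, TensorFlow starts with a small allocation and grows it as needed instead of grabbing a fixed 1024 MB block, so large activations are less likely to hit an artificial limit.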

Additionally, you can resize or crop the input images to a smaller size to further reduce memory usage.
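
For example, a minimal sketch of downsampling the inputs before feeding them to VGG16 (the 224x224 target size is just illustrative):

import numpy as np
import tensorflow as tf
from tensorflow import keras

# VGG16 without the classifier head, built for the smaller input size.
vgg_16 = keras.applications.VGG16(include_top=False, input_shape=(224, 224, 3))

batch = np.random.rand(4, 600, 600, 3).astype('float32')
# Downsample 600x600 inputs to 224x224; convolutional activation memory scales
# roughly with the number of pixels, so this cuts usage by about a factor of 7.
small_batch = tf.image.resize(batch, (224, 224))
output = vgg_16(small_batch)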

THN