I am a newbie to GPU-based training and deep learning models. I am running a cDCGAN (conditional DCGAN) in TensorFlow on my two NVIDIA GTX 1080 GPUs. My dataset consists of around 320,000 images of size 64*64 with 2,350 class labels. My batch size is very small (10), because I was getting an OOM error (OOM when allocating tensor with shape[32,64,64,2351]) with a larger batch size.
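To give a sense of scale, here is a rough estimate of how big that tensor is, assuming float32 and assuming the 2351 channels are the 64*64 image (1 channel) plus the 2350 class labels tiled as one-hot channels (which is what the shape in the error suggests):

    # shape from the OOM message: [32, 64, 64, 2351]
    batch, h, w = 32, 64, 64
    channels = 1 + 2350          # 1 image channel + 2350 one-hot label channels (my assumption)
    bytes_per_float = 4          # float32

    tensor_bytes = batch * h * w * channels * bytes_per_float
    print("one such tensor: %.2f GiB" % (tensor_bytes / 1024.0 ** 3))
    # prints ~1.15 GiB for a single activation, before any conv outputs or gradients

So even one copy of that input is already well over 1 GiB of the 8 GB on a GTX 1080.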
The training is very slow, which I understand is down to the batch size (correct me if I am wrong). If I run watch -n 1 nvidia-smi, I get the following output.
GPU:0 is the one mainly used: its Volatile GPU-Util is around 0%-65%, whereas GPU:1 stays at 0%-3% at most. The performance state (Perf) of GPU:0 is always P2, whereas GPU:1 is mostly P8 and sometimes P2. I have the following questions.
1) Why is GPU:1 barely used, and why does it mostly sit in P8 even though it is almost idle? (I am not placing any work on the second GPU explicitly; see the sketch after these questions.)
2) Is the slow training down to my small batch size only, or can there be other reasons?
3) How can I improve performance?
4) How can I avoid the OOM error with a bigger batch size?
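For context, I am not doing anything explicit to place ops on both GPUs, which I suspect is related to question 1. My understanding is that something like the tower pattern below would be needed to actually use GPU:1. This is only a minimal sketch in TF 1.x style with a stand-in loss instead of my real cDCGAN losses, not my actual code:

    import tensorflow as tf

    def tower_loss(images, labels):
        # stand-in for the per-GPU cDCGAN forward pass and loss, NOT my real model
        w = tf.get_variable('w', [3, 3, 1, 8])
        h = tf.nn.conv2d(images, w, [1, 2, 2, 1], 'SAME')
        return tf.reduce_mean(h) + 0.0 * tf.reduce_mean(labels)

    images = tf.placeholder(tf.float32, [32, 64, 64, 1])
    labels = tf.placeholder(tf.float32, [32, 2350])
    img_splits = tf.split(images, 2)    # half of the batch for each GPU
    lbl_splits = tf.split(labels, 2)

    optimizer = tf.train.AdamOptimizer(2e-4)
    tower_grads = []
    for i in range(2):
        # pin each tower's ops to one GPU; weights are shared via variable reuse
        with tf.device('/gpu:%d' % i), tf.variable_scope('model', reuse=(i > 0)):
            loss = tower_loss(img_splits[i], lbl_splits[i])
            tower_grads.append(optimizer.compute_gradients(loss))

    # average the two towers' gradients and apply them once
    avg_grads = [(tf.add_n([g for g, _ in pair]) / 2.0, pair[0][1])
                 for pair in zip(*tower_grads)]
    train_op = optimizer.apply_gradients(avg_grads)

Is something like this required, or is there a simpler way to spread the work over both cards?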
Edit 1:
Model details are as follows:
Generator:
I have 4 layers (fully connected, UpSampling2d-conv2d, UpSampling2d-conv2d, conv2d).
W1 is of shape [X+Y, 16*16*128], i.e. (2450, 32768); w2 is [3, 3, 128, 64], w3 is [3, 3, 64, 32], and w4 is [3, 3, 32, 1].
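Roughly in code, the generator looks like this (a simplified TF 1.x-style sketch built from the shapes above, not my exact code: biases and batch norm are left out, and nearest-neighbour upsampling is assumed for the UpSampling2d steps):

    import tensorflow as tf

    def generator(z_and_label):                # [batch, 2450] = noise concatenated with one-hot label
        w1 = tf.get_variable('g_w1', [2450, 16 * 16 * 128])
        h = tf.nn.relu(tf.matmul(z_and_label, w1))
        h = tf.reshape(h, [-1, 16, 16, 128])

        # 16x16 -> 32x32: upsample, then 3x3 conv (w2)
        h = tf.image.resize_nearest_neighbor(h, [32, 32])
        w2 = tf.get_variable('g_w2', [3, 3, 128, 64])
        h = tf.nn.relu(tf.nn.conv2d(h, w2, [1, 1, 1, 1], 'SAME'))

        # 32x32 -> 64x64: upsample, then 3x3 conv (w3)
        h = tf.image.resize_nearest_neighbor(h, [64, 64])
        w3 = tf.get_variable('g_w3', [3, 3, 64, 32])
        h = tf.nn.relu(tf.nn.conv2d(h, w3, [1, 1, 1, 1], 'SAME'))

        # final 3x3 conv (w4) down to one 64x64 output channel
        w4 = tf.get_variable('g_w4', [3, 3, 32, 1])
        return tf.nn.tanh(tf.nn.conv2d(h, w4, [1, 1, 1, 1], 'SAME'))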
Discriminator:
It has five layers (conv2d, conv2d, conv2d, conv2d, fully connected).
w1 is [5, 5, X+Y, 64], i.e. (5, 5, 2351, 64); w2 is [3, 3, 64, 64], w3 is [3, 3, 64, 128], w4 is [2, 2, 128, 256], and w5 (the fully connected layer) is [16*16*256, 1].
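Sketched the same way, the discriminator is below; the strides are my guess (2, 2, 1, 1), chosen so the spatial size goes 64 -> 32 -> 16 -> 16 -> 16 and matches the 16*16*256 fully connected input:

    import tensorflow as tf

    def discriminator(x_and_label):            # [batch, 64, 64, 2351] = image plus label channels
        w1 = tf.get_variable('d_w1', [5, 5, 2351, 64])
        h = tf.nn.leaky_relu(tf.nn.conv2d(x_and_label, w1, [1, 2, 2, 1], 'SAME'))   # 64 -> 32
        w2 = tf.get_variable('d_w2', [3, 3, 64, 64])
        h = tf.nn.leaky_relu(tf.nn.conv2d(h, w2, [1, 2, 2, 1], 'SAME'))             # 32 -> 16
        w3 = tf.get_variable('d_w3', [3, 3, 64, 128])
        h = tf.nn.leaky_relu(tf.nn.conv2d(h, w3, [1, 1, 1, 1], 'SAME'))             # 16 -> 16
        w4 = tf.get_variable('d_w4', [2, 2, 128, 256])
        h = tf.nn.leaky_relu(tf.nn.conv2d(h, w4, [1, 1, 1, 1], 'SAME'))             # 16 -> 16
        h = tf.reshape(h, [-1, 16 * 16 * 256])
        w5 = tf.get_variable('d_w5', [16 * 16 * 256, 1])
        return tf.matmul(h, w5)                # raw logit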