I have installed the tensorflow-gpu package, as well as all requirements for running tensorflow on the GPU.

Now I wanted to test it, so I downloaded the retrain.py script from the image retraining tutorial (https://www.tensorflow.org/tutorials/image_retraining) and ran it with my own images (horses in one folder; cars, planes, and some others in a folder named "unknown"). However, it doesn't run on the GPU (0% usage), only on the CPU (7% usage).

GPU: Nvidia GeForce GTX 1060 6GB, Driver Version: 390.59, Cuda Version: 9.0.176

CPU: Ryzen 7 1700x

OS: Ubuntu Server 18.04

  • How do you check GPU usage? – Sunreef May 29 '18 at 11:22
  • @Sunreef I use "nvidia-smi" – Wilhelm Baumotte May 30 '18 at 07:37
  • Try using "watch -n 0.1 nvidia-smi" instead to follow the evolution of GPU usage. Keep it running in a separate terminal – Sunreef May 30 '18 at 07:46
  • @Sunreef I'm doing exactly that. – Wilhelm Baumotte Jun 04 '18 at 06:52
  • Are you using instructions like `with tf.device('/device:GPU:0'):` in your script when defining your model ? Without these, Tensorflow will execute everything on CPU. – Sunreef Jun 04 '18 at 08:45
  • @Sunreef, the statement "Without [`with tf.device('GPU:0'):`], Tensorflow will execute everything on CPU" is incorrect. If no explicit device placement is specified, TensorFlow decides on device placement automatically. This decision is fairly complex (it can even involve a neural net if you enable experimental features), but in the vast majority of cases most ops will be placed on a GPU if one is available. – iga Jun 06 '18 at 21:45

1 Answer

0% GPU usage and 7% CPU usage is suspicious, since neither device is busy. There may be a non-compute bottleneck, e.g. reading images from a remote disk. To test this, I would try running a bunch of large matmuls and checking whether GPU usage goes up.
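
The matmul test suggested above could look roughly like the following. This is a minimal sketch, not part of retrain.py, and it assumes the TensorFlow 1.x API that was current for this question (`tf.Session`, `tf.random_normal`); newer versions would need adapting. The function names, the matrix size, and the repeat count are hypothetical choices for illustration.

```python
import time


def matmul_gflops(size, seconds):
    """GFLOP/s for one size x size matmul (about 2 * size**3 floating-point ops)."""
    return 2.0 * size ** 3 / seconds / 1e9


def time_matmuls(size=4096, repeats=20, device="/device:GPU:0"):
    """Average seconds per run on `device` (TensorFlow 1.x API assumed)."""
    import tensorflow as tf  # imported here so the helper above works without TF

    with tf.device(device):
        # Fresh random inputs are generated on every run, so the timing
        # includes that cost as well as the matmul itself.
        a = tf.random_normal([size, size])
        b = tf.random_normal([size, size])
        # Reduce to a scalar so only one number is copied back to the host.
        total = tf.reduce_sum(tf.matmul(a, b))

    # Soft placement is left at its default (off), so pinning to a GPU that
    # TensorFlow cannot see raises an error instead of silently using the CPU.
    with tf.Session() as sess:
        sess.run(total)  # warm-up run, excluded from the timing
        start = time.time()
        for _ in range(repeats):
            sess.run(total)
        return (time.time() - start) / repeats
```

Calling `time_matmuls()` while `watch -n 0.1 nvidia-smi` runs in another terminal should push GPU utilization well above 0% if the GPU is really being used. Comparing `matmul_gflops(4096, time_matmuls())` against the same call with `device="/device:CPU:0"` gives a rough sense of the speedup.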

In general, do the following. First, list the available devices (see "How to get current available GPUs in tensorflow?"). If a GPU is listed, TensorFlow has recognized it. Next, ask TensorFlow to report which devices it actually uses for your operations: see the 'Logging Device Placement' section in https://www.tensorflow.org/programmers_guide/using_gpu. If operations are placed on the GPU there, all is good, and the near-zero GPU usage probably means your operations are too small or there is another bottleneck.
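
The two checks described above can be sketched as follows, again assuming the TensorFlow 1.x API. The function names are hypothetical, and the tiny matmul is only a stand-in for your own graph.

```python
def gpu_device_names(devices):
    """Return the names of all entries whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_gpus():
    """Names of the GPUs TensorFlow can see, e.g. '/device:GPU:0'."""
    from tensorflow.python.client import device_lib

    return gpu_device_names(device_lib.list_local_devices())


def run_with_placement_log():
    """Run a small matmul with device placement logging switched on."""
    import tensorflow as tf  # TensorFlow 1.x API assumed

    a = tf.constant([[1.0, 2.0], [3.0, 4.0]], name="a")
    b = tf.constant([[1.0, 0.0], [0.0, 1.0]], name="b")
    c = tf.matmul(a, b, name="c")

    # log_device_placement=True makes the session log the device
    # chosen for every operation in the graph.
    config = tf.ConfigProto(log_device_placement=True)
    with tf.Session(config=config) as sess:
        return sess.run(c)
```

If `list_gpus()` returns an empty list, TensorFlow has not recognized the GPU, and the installation (driver, CUDA, cuDNN, tensorflow-gpu build) is the first thing to check. If it returns a name, the placement log from `run_with_placement_log()` shows whether ops actually land on that device.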

iga
  • There are 5507 lines with "GPU" in them and 21187 lines with "CPU" in them. Also, when TensorFlow is running, one CPU core is at 100% while the other cores aren't used at all. The images are all stored locally on an HDD, but I currently only have 30 pictures of horses and 50 for the category "unknown". – Wilhelm Baumotte Jun 04 '18 at 06:52
  • If your line counts refer to the device each operation is placed on, I guess the GPU is being used. If you are now wondering why it is used less than you expect, I would suggest using the timeline tool (https://towardsdatascience.com/howto-profile-tensorflow-1a49fb18073d) to look for the bottleneck and asking another detailed question if you get stuck. – iga Jun 06 '18 at 21:39
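
Counting every line that merely contains "GPU" can also pick up unrelated log messages, so a stricter count of actual op placements may be more informative. The sketch below assumes placement lines of the form `MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0`, as shown in the 'Logging Device Placement' guide; the exact format may differ between TensorFlow versions, and the helper name is hypothetical.

```python
import re
from collections import Counter

# Assumed line shape: "<op name>: (<op type>): <device path ending in CPU:n or GPU:n>"
PLACEMENT = re.compile(r"^\S+: \(\S+\): \S*(CPU|GPU):\d+\s*$", re.IGNORECASE)


def count_placements(lines):
    """Count ops per device type, ignoring lines that are not placement lines."""
    counts = Counter()
    for line in lines:
        match = PLACEMENT.match(line.strip())
        if match:
            counts[match.group(1).upper()] += 1
    return counts
```

Feeding it the saved log of a run with device placement logging enabled, e.g. `count_placements(open("placement.log"))` with a hypothetical file name, gives one number per device type.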