Why did TensorFlow not get faster after a GPU upgrade?

Question

I have the GPU version of TensorFlow 1.4 installed, along with CUDA 8.

I trained a fairly simple GAN on the MNIST data. My machine has an AMD FX 8320 CPU, 16 GB of system memory, and an SSD.

An epoch took about 17 seconds on a GeForce 720 GPU with 1 GB of memory. Training used about 25% of the GPU and 99% of its memory. CPU load was high, close to 100%.

Then I replaced that card with a GeForce 1050 Ti with 4 GB of memory. GPU load was only 5-6% and GPU memory utilization was 93%, but an epoch still took about 17 seconds and CPU load stayed high.

So does TensorFlow have settings that make it use more of the GPU? Or what causes the high CPU load and the low GPU load?

Have you modified your code to make use of the multiple GPUs ? — Abhai Kollara, Dec 28 '17 at 17:49
I am also interested because my situation is the same. All the files are installed properly, the GPU is recognized but the speed seems to be the same, if not worse. — lucians, Dec 29 '17 at 08:45

score 1 · Answer 1 · answered Jul 25 '18 at 01:01

If you are training a simple GAN, it is fairly likely that your old GPU was not the bottleneck in the first place, so upgrading it had no effect. If the amount of work done per sess.run() call is very small, the overheads (executing your Python code, copying the input data to the GPU, starting and running the TensorFlow executor, scheduling all the operations on the GPU, etc.) can dominate the computation. The near-100% CPU load combined with 5-6% GPU load that you report is consistent with that.
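To make the overhead argument concrete, here is a back-of-the-envelope sketch. It assumes each step costs a fixed overhead plus some GPU work, and that only the GPU work shrinks with a faster card. The function name and all the numbers are hypothetical, chosen only so the total lands near the 17 seconds per epoch from the question; they are not measurements.

```python
def epoch_seconds(steps, overhead_s, gpu_work_s, gpu_speedup=1.0):
    """Estimate epoch time when every step pays a fixed overhead
    (Python, input copy, scheduling) plus GPU compute time."""
    return steps * (overhead_s + gpu_work_s / gpu_speedup)


# Hypothetical numbers, for illustration only.
steps = 1000          # sess.run() calls per epoch
overhead_s = 0.016    # fixed per-call overhead, unaffected by the GPU
gpu_work_s = 0.001    # actual GPU compute per call on the old card

old_gpu = epoch_seconds(steps, overhead_s, gpu_work_s)
new_gpu = epoch_seconds(steps, overhead_s, gpu_work_s, gpu_speedup=5.0)

print("old GPU: %.1f s per epoch" % old_gpu)
print("new GPU: %.1f s per epoch" % new_gpu)
```

With these made-up values a card that is five times faster shortens the epoch by less than a second, because almost all of the time is spent outside the GPU.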

The only sure way to know what is happening is to profile. The performance guide at https://www.tensorflow.org/performance/performance_guide is a good starting point, and the timeline tool it mentions can be quite useful. For more details, see the question "Can I measure the execution time of individual operations with TensorFlow?".
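Before reaching for the timeline tool, a coarse wall-clock breakdown can already show whether time goes into preparing inputs in Python or into the session call. The sketch below uses only the standard library. profile_loop, make_batch and run_step are hypothetical names; in a real script run_step would wrap your sess.run() call and make_batch would wrap your batching code. The stand-in functions here do trivial work just so the example runs.

```python
import time
from collections import defaultdict


def profile_loop(make_batch, run_step, num_steps):
    """Accumulate wall-clock seconds spent in input preparation
    and in the training step, over num_steps iterations."""
    totals = defaultdict(float)
    for _ in range(num_steps):
        t0 = time.perf_counter()
        batch = make_batch()
        t1 = time.perf_counter()
        run_step(batch)
        t2 = time.perf_counter()
        totals["input"] += t1 - t0
        totals["step"] += t2 - t1
    return dict(totals)


# Stand-ins (assumptions): replace with real batching and sess.run().
def make_batch():
    return [float(i) for i in range(10000)]


def run_step(batch):
    return sum(batch)


totals = profile_loop(make_batch, run_step, num_steps=50)
for name, seconds in sorted(totals.items()):
    print("%-5s %.4f s" % (name, seconds))
```

Note that the "step" bucket still includes the per-call overheads of sess.run(), not just GPU compute, so this only separates Python-side input work from the session call. Splitting the session call further into individual operations is what the timeline tool is for.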

score 0 · Answer 2 · answered Sep 26 '18 at 11:34

Agreed. For the MNIST dataset there are probably other bottlenecks in the system, not the GPU. I ran TensorFlow side by side on two machines:

Intel i7 4600M with an NVIDIA Quadro K1100M GPU and 12 GB of RAM, which is a 4th Gen Haswell Intel machine, and
Intel i5 8300U with no CUDA GPU and 16 GB of RAM.

So this is basically an 8th Gen Kaby Lake Intel CPU without a GPU versus a 4th Gen Intel CPU with one. The results:

4th Gen Intel chip with NVIDIA GPU: 311.5 sec, 315.9 sec, 313.0 sec to complete all 10 epochs of an MNIST run
8th Gen Intel chip with no GPU: 252.7 sec, 243.5 sec, 254.9 sec

So the run finishes in about 20% less time with no GPU, just a newer generation of Intel chip.
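As a quick check of that figure, the snippet below averages the three timings listed above for each machine and compares them. The variable names are made up for the example; the numbers are the ones reported in this answer.

```python
# Timings in seconds for 10 MNIST epochs, copied from the runs above.
with_gpu = [311.5, 315.9, 313.0]   # 4th Gen Intel + NVIDIA GPU
cpu_only = [252.7, 243.5, 254.9]   # 8th Gen Intel, no GPU

mean_gpu = sum(with_gpu) / len(with_gpu)
mean_cpu = sum(cpu_only) / len(cpu_only)

time_saved = 1.0 - mean_cpu / mean_gpu   # fraction of run time saved
speedup = mean_gpu / mean_cpu            # throughput ratio

print("mean with GPU: %.1f s" % mean_gpu)
print("mean CPU only: %.1f s" % mean_cpu)
print("time saved: %.1f%%, speedup: %.2fx" % (100 * time_saved, speedup))
```

The run time drops by roughly 20%, which corresponds to a speedup of about 1.25x.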
