
Hi, I was reading the Using GPUs page for TensorFlow, and I was wondering whether GPU precision performance is ever a factor in TensorFlow. For example, given a machine with two cards,

gaming GPU + workstation GPU

is there any implementation in which the workstation card's higher-precision performance could overcome its slower clock speed?

I'm not sure whether these situations would arise in the context of gradient descent, in network performance after training, or elsewhere entirely, but I would love to get more information on the topic!

Thanks in advance.

kpie
  • As with most performance problems, you'd have to actually try some computations on both GPUs to compare the results. Also, I'm sorry, but I didn't fully understand your sentences; could you try rewriting your text more clearly, please? – J. Martinot-Lagarde Jul 05 '18 at 20:21

1 Answer


TL;DR

The opposite is actually the case: higher-precision calculations are less desirable for frameworks like TensorFlow, because they mean slower training and larger models (more RAM and disk space).

The long version

Neural networks actually benefit from using lower precision representations. This paper is a good introduction to the topic.

The key finding of our exploration is that deep neural networks can be trained using low-precision fixed-point arithmetic, provided that the stochastic rounding scheme is applied while operating on fixed-point numbers.

They use 16-bit fixed-point numbers rather than the much higher-precision 32-bit floating-point numbers (more information on the difference here).
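To make the paper's rounding scheme concrete, here is a rough NumPy sketch of stochastic rounding to a 16-bit fixed-point grid. The 3-integer/12-fractional bit split and the `quantize_stochastic` helper are my own illustration, not the paper's code:

```python
import numpy as np

def quantize_stochastic(x, int_bits=3, frac_bits=12):
    """Stochastically round x to a signed fixed-point grid.

    Grid spacing is eps = 2**-frac_bits; each value is rounded down or
    up with probability equal to its fractional remainder, so the
    rounding is unbiased in expectation.
    """
    eps = 2.0 ** -frac_bits
    scaled = x / eps
    floor = np.floor(scaled)
    rounded = floor + (np.random.random(x.shape) < (scaled - floor))
    # Saturate to the representable range of the fixed-point format.
    limit = 2.0 ** int_bits - eps
    return np.clip(rounded * eps, -limit, limit)

weights = np.random.randn(4).astype(np.float32)
print(weights)                       # full float32 values
print(quantize_stochastic(weights))  # values snapped to the 2**-12 grid
```

Because the round-up probability equals the fractional remainder, the quantized value is correct in expectation, which is what lets tiny gradient updates survive quantization instead of always rounding to zero.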

The following image is taken from that paper. It shows the test error for different rounding schemes, as well as for different numbers of bits dedicated to the integer part of the fixed-point representation. As you can see, the solid red and blue lines (16-bit fixed) have very similar error to the black line (32-bit float).

[Figure from the paper: test error curves for different rounding schemes and integer bit-widths]

The main benefit of, and driver for, going to lower precision is the computational cost and the storage of the weights, so higher-precision hardware would not give enough of an accuracy increase to outweigh the cost of slower computation.
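The storage half of that argument is easy to check for yourself; the 10-million-weight layer below is just a made-up example:

```python
import numpy as np

# Hypothetical layer with 10 million weights: halving the precision
# halves the storage, before any speed gains from cheaper arithmetic.
w32 = np.zeros(10_000_000, dtype=np.float32)
w16 = w32.astype(np.float16)
print(f"float32: {w32.nbytes / 2**20:.1f} MiB")  # ~38.1 MiB
print(f"float16: {w16.nbytes / 2**20:.1f} MiB")  # ~19.1 MiB
```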

I believe studies like this are a large driver behind the specs for neural-network-specific processing hardware, such as Google's new TPU. Even though most GPUs don't support 16-bit floats yet, Google is working to support it.
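For what that support ended up looking like in practice, here is a minimal sketch using the mixed-precision API that TensorFlow later shipped (assumes TF 2.4 or newer, which postdates this answer):

```python
import tensorflow as tf

# Run layer computations in float16 while keeping the variables
# (weights) in float32 for numerically stable updates.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10),
    # Keep the final softmax in float32 to avoid under/overflow.
    tf.keras.layers.Activation("softmax", dtype="float32"),
])
print(model.layers[0].compute_dtype)  # float16
print(model.layers[0].dtype)          # float32 (variable dtype)
```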

McAngus
  • From the introduction of the provided paper -> "the natural error resiliency of neural network architectures and learning algorithms is well-documented, setting them apart from more traditional workloads that typically require precise computations and number representations with high dynamic range. It is well appreciated that in the presence of statistical approximation and estimation errors, high-precision computation in the context of learning is rather unnecessary" (I may award the bounty in 4 hours) – kpie Jul 06 '18 at 15:37
  • Unless anyone wants to provide evidence to the contrary. – kpie Jul 06 '18 at 15:38