
If I perform a float (single-precision) operation on the host and on the device (GPU arch sm_13), will the values be different?
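
For concreteness, the kind of comparison I have in mind is something like the sketch below (the kernel name and input values are made up for illustration): compute the same single-precision expression once on the CPU and once in a trivial kernel, then compare the two results.

```c++
// Sketch: run the same single-precision expression on the host and on
// the device, then compare the two results bit for bit.
#include <cstdio>
#include <cmath>

__global__ void same_op(float x, float *out)   // hypothetical kernel
{
    *out = x * x / 3.0f - sinf(x);             // same arithmetic as the host side below
}

int main()
{
    const float x = 0.1f;
    const float host_result = x * x / 3.0f - sinf(x);

    float *d_out;
    float device_result = 0.0f;
    cudaMalloc(&d_out, sizeof(float));
    same_op<<<1, 1>>>(x, d_out);
    cudaMemcpy(&device_result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    printf("host   = %.9g\n", host_result);
    printf("device = %.9g\n", device_result);
    printf("%s\n", host_result == device_result ? "identical" : "different");
    return 0;
}
```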

Abhinav
  • The GT200 family of GPUs didn't support IEEE-754 single precision operations. But even if they did, the answer would probably be yes, the results *might* be different. – talonmies Apr 26 '12 at 13:45
  • @talonmies I have reformulated my query [here](http://stackoverflow.com/q/10335782/842808). Kindly take a look. Thanks – Abhinav Apr 26 '12 at 14:47

1 Answer


A good discussion of this is available in a whitepaper from NVIDIA. Basically:

  • IEEE-754 is implemented by almost all current hardware;
  • Even between faithful implementations of the standard, you can still see differences in results (famously, Intel's x87 unit does 80-bit arithmetic internally for double precision), and aggressive compiler optimization settings can also change results;
  • Compute capability 2.0 and later NVIDIA cards support IEEE-754 in both single and double precision, with only very small caveats:
    • Some rounding modes aren't supported for some operations - this is only relevant if you explicitly change rounding modes in your code
    • There are some subtleties involving fused multiply-adds (FMAs); see the sketch after this list
    • CUDA also provides (slightly) lower-precision but faster implementations of several operations, and of course if you use those explicitly or implicitly (with compiler options) you naturally won't get full IEEE-754 results
  • Compute capability 1.3 cards support IEEE-754 as above in double precision but not in single precision (single precision doesn't support denormal, i.e. very small, numbers; there are no FMAs; and square root and division aren't fully accurate)
  • Compute capability 1.2 cards only have single precision, and it isn't fully IEEE-754 as above.
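
To make the fused multiply-add caveat concrete, here is a small host-compilable sketch (values chosen purely for illustration) contrasting one rounding via `fmaf()` with two roundings via a separate multiply and add. On FMA-capable GPUs, nvcc contracts `a*b + c` into an FMA by default (this can be turned off with `--fmad=false`), while the host compiler may or may not do the same, which is one source of last-bit differences.

```c++
// Sketch: one rounding (fused multiply-add) vs. two roundings
// (separate multiply then add). x is chosen so that x*x - 1 is not
// exactly representable in single precision.
#include <cstdio>
#include <cmath>

int main()
{
    const float x = 1.0f + 1.0f / 4096.0f;     // 1 + 2^-12, exactly representable

    float separate = x * x - 1.0f;             // two roundings (unless the host
                                               // compiler contracts this too)
    float fused    = fmaf(x, x, -1.0f);        // one rounding

    printf("separate x*x - 1     = %.9g\n", separate);
    printf("fused fmaf(x, x, -1) = %.9g\n", fused);
    printf("%s\n", separate == fused ? "identical" : "differ in the low-order bits");
    return 0;
}
```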
Jonathan Dursi
  • If, despite sticking to the IEEE-754 standard, the floating-point values calculated on a CPU and a GPU can be different depending on (a) hardware implementation, e.g. Intel's 80-bit method, (b) compiler optimizations, etc., then why do we call it a standard? – Abhinav Apr 26 '12 at 15:21
  • @Abhinav: It is a standard because it defines storage rules, formats, rounding rules, operations, interchange formats and exceptions. It (depending on which version) also defines reproducibility criteria. But everything has a tolerance. It means floating point will work the *same way* on any standards-conforming platform. It doesn't mean that results will be bitwise identical. – talonmies Apr 26 '12 at 16:18
  • @Abhinav: trust me, before IEEE-754 things used to be much, much worse. – Jonathan Dursi Apr 27 '12 at 03:14
  • @Abhinav: and compiler optimizations are a completely different beast. A compiler can re-order the operations in your code if you let it; in that case, even if all operations are performed exactly the same way on different systems, you'll get different answers because their _order_ is different - floating-point addition is inherently non-associative. – Jonathan Dursi Apr 27 '12 at 11:29
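
A tiny host-side sketch of that reordering effect (the values are arbitrary, chosen only so the cancellation is visible in single precision):

```c++
// Sketch: single-precision addition is not associative, so summing the
// same three values in a different order gives different results.
#include <cstdio>

int main()
{
    const float a = 1.0f, b = 1.0e8f, c = -1.0e8f;

    float left  = (a + b) + c;   // the 1 is absorbed into 1e8 by rounding, result 0
    float right = a + (b + c);   // b and c cancel exactly first, result 1

    printf("(a + b) + c = %g\n", left);
    printf("a + (b + c) = %g\n", right);
    return 0;
}
```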