Here is an eminently surprising fact about floating point: single-precision (`float`) arithmetic is not necessarily faster than double precision.
How can this be? Floating-point arithmetic is hard, so doing it with twice the precision is at least twice as hard and must take longer, right?
Well, no. Yes, it's more work to compute with higher precision, but as long as the work is being done by dedicated hardware (by some kind of floating-point unit, or FPU), everything is probably happening in parallel. Double precision may be twice as hard, and there may therefore be twice as many transistors devoted to it, but it doesn't take any longer.
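You can see this for yourself with a minimal, unscientific timing sketch like the one below (my own illustration; the loop count is arbitrary). On a typical CPU with a hardware FPU, compiled without aggressive optimization, the two timings usually come out roughly the same. (With auto-vectorization enabled, `float` can actually pull ahead, because twice as many single-precision values fit in a SIMD register, but that's a separate effect from the scalar FPU arithmetic we're talking about here.)

```c
/*
 * A minimal, unscientific timing sketch (my own illustration; the loop
 * count is arbitrary).  It runs the same accumulation once in float and
 * once in double, and reports the elapsed CPU time for each.
 */
#include <stdio.h>
#include <time.h>

#define N 100000000L

static float sum_float(void)
{
    float s = 0.0f;
    for (long i = 0; i < N; i++)
        s += (float)i * 0.5f;
    return s;
}

static double sum_double(void)
{
    double s = 0.0;
    for (long i = 0; i < N; i++)
        s += (double)i * 0.5;
    return s;
}

int main(void)
{
    clock_t t0, t1;
    volatile float fs;    /* volatile keeps the compiler from discarding the result */
    volatile double ds;

    t0 = clock();
    fs = sum_float();
    t1 = clock();
    printf("float:  %g  (%.2f s)\n", (double)fs, (double)(t1 - t0) / CLOCKS_PER_SEC);

    t0 = clock();
    ds = sum_double();
    t1 = clock();
    printf("double: %g  (%.2f s)\n", ds, (double)(t1 - t0) / CLOCKS_PER_SEC);

    return 0;
}
```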
In fact, if you're on a system with an FPU that supports both single- and double-precision floating point, a good rule is: always use `double`. The reason for this rule is that type `float` is often inadequately accurate. So if you always use `double`, you'll quite often avoid numerical inaccuracies (ones that would kill you if you used `float`), but it won't be any slower.
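Here is a small illustration of the kind of inaccuracy I mean (my own example, with illustrative numbers): `float` carries only about 7 significant decimal digits, so it can't even distinguish 16777216 from 16777217, and rounding errors pile up quickly when you accumulate many small values. `double`, with roughly 16 digits, handles both cases comfortably.

```c
/*
 * Two quick demonstrations (illustrative numbers) of float's limited
 * precision: it can't distinguish 2^24 from 2^24 + 1, and rounding
 * error accumulates visibly when summing many small values.
 */
#include <stdio.h>

int main(void)
{
    float  f = 16777216.0f;    /* 2^24: beyond this, float can't represent every integer */
    double d = 16777216.0;

    printf("float:  16777216 + 1 = %.1f\n", f + 1.0f);  /* prints 16777216.0 */
    printf("double: 16777216 + 1 = %.1f\n", d + 1.0);   /* prints 16777217.0 */

    /* Add 0.01 ten million times; the exact answer is 100000. */
    float  fsum = 0.0f;
    double dsum = 0.0;
    for (int i = 0; i < 10000000; i++) {
        fsum += 0.01f;
        dsum += 0.01;
    }
    printf("float  sum: %f\n", fsum);   /* noticeably off from 100000 */
    printf("double sum: %f\n", dsum);   /* very close to 100000 */

    return 0;
}
```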
Now, everything I've said so far assumes that your FPU does support, in hardware, the types you care about. If a floating-point type isn't supported in hardware and has to be emulated in software, it's obviously going to be slower, often much slower. There are at least three areas where this effect manifests itself:
- If you're using a microcontroller with no FPU at all, it's common for all floating point to be implemented in software, and to be painfully slow. (I think it's also common for double precision to be even slower, meaning that `float` may be advantageous there.)
- If you're using a nonstandard or less-than-standard type that, for that reason, is implemented in software, it's obviously going to be slower. In particular, the FPUs I'm familiar with don't support a half-precision (16-bit) floating-point type, so yes, it wouldn't be surprising if it were significantly slower than regular `float` or `double` (see the sketch after this list).
- Some GPUs have good support for single or half precision, but poor or no support for double.
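For the half-precision case, here's a hedged sketch. It uses `_Float16`, a compiler extension (an optional type in C23) that GCC and Clang provide only on some targets; the `__FLT16_MANT_DIG__` guard is my assumption about how those compilers advertise it. Where the hardware has no native half-precision arithmetic, each `_Float16` operation is typically emulated by converting to `float`, operating, and converting back, so the half-precision loop tends to come out slower than the plain `float` one.

```c
/*
 * Hedged sketch: time the same loop in float and in _Float16.  _Float16
 * availability and the __FLT16_MANT_DIG__ detection macro are assumptions
 * about specific compilers (GCC/Clang) and targets.  Without native
 * half-precision hardware, each _Float16 operation is typically emulated,
 * so the second loop is usually slower.
 */
#include <stdio.h>
#include <time.h>

#define N 50000000L

int main(void)
{
#if defined(__FLT16_MANT_DIG__)
    clock_t t0, t1;

    volatile float fs = 0.0f;       /* volatile forces each iteration to be performed */
    t0 = clock();
    for (long i = 0; i < N; i++)
        fs = fs * 0.5f + 1.0f;
    t1 = clock();
    printf("float:    %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

    volatile _Float16 hs = (_Float16)0.0f;
    t0 = clock();
    for (long i = 0; i < N; i++)
        hs = hs * (_Float16)0.5f + (_Float16)1.0f;
    t1 = clock();
    printf("_Float16: %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
#else
    printf("this compiler/target doesn't provide _Float16\n");
#endif
    return 0;
}
```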