98

I did some timing tests and also read some articles like this one (see the last comment), and it looks like in a Release build, float and double values take the same amount of processing time.

How is this possible? When float is less precise and smaller than double, how can the CLR process doubles in the same amount of time?

Joan Venge
  • 12
    I don't think it's an exact duplicate, as this one is asking the reason behind it, whereas the other user is asking if it's actually faster, but not necessarily why. – Joan Venge Jan 06 '09 at 19:14
  • Supposedly an exact duplicate of *[Are doubles faster than floats in C#?](https://stackoverflow.com/questions/158889)* (claimed in 2009 by another user). – Peter Mortensen Nov 18 '17 at 11:22

4 Answers

166

On x86 processors, at least, float and double will each be converted to a 10-byte real by the FPU for processing. The FPU doesn't have separate processing units for the different floating-point types it supports.

The age-old advice that float is faster than double applied 100 years ago when most CPUs didn't have built-in FPUs (and few people had separate FPU chips), so most floating-point manipulation was done in software. On these machines (which were powered by steam generated by the lava pits), it was faster to use floats. Now the only real benefit to floats is that they take up less space (which only matters if you have millions of them).
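
As a rough illustration of both points, here is a minimal benchmark sketch (the class and constant names are made up for this example) that times identical work over a `float[]` and a `double[]` and prints each array's raw size. On a typical x86/x64 JIT the two loops take about the same time; the clear difference is the memory footprint:

    // Times identical work on float[] and double[] arrays and prints each
    // array's size in MB. The loop timings usually come out very close; the
    // footprint of the double array is exactly twice that of the float array.
    using System;
    using System.Diagnostics;

    class FloatVsDoubleSketch
    {
        const int N = 10000000; // "millions of them"

        static void Main()
        {
            float[] f = new float[N];
            double[] d = new double[N];
            for (int i = 0; i < N; i++) { f[i] = i; d[i] = i; }

            var sw = Stopwatch.StartNew();
            float fSum = 0;
            for (int i = 0; i < N; i++) fSum += f[i] * 1.0001f;
            sw.Stop();
            Console.WriteLine("float : {0} ms, {1} MB", sw.ElapsedMilliseconds, sizeof(float) * (long)N / (1024 * 1024));

            sw.Restart();
            double dSum = 0;
            for (int i = 0; i < N; i++) dSum += d[i] * 1.0001;
            sw.Stop();
            Console.WriteLine("double: {0} ms, {1} MB", sw.ElapsedMilliseconds, sizeof(double) * (long)N / (1024 * 1024));

            Console.WriteLine(fSum + " " + dSum); // keep the sums live so the JIT can't drop the loops
        }
    }
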

P Daddy
  • 9
    Perhaps not 100 years ago... Some FPUs support native handling at float, double, and 80-bit levels and will execute faster at the shorter lengths. Some will actually execute some things slower at shorter lengths too... :-) – Brian Knoblauch Jan 06 '09 at 18:34
  • 4
    Possible exception: I think the time for divisions is dependent on the number of bits (1 clock cycle/2 bits). Timings I've made of float vs double division seem to tally with this. – Neil Coffey Jan 06 '09 at 18:39
  • 25
    Caveat for SIMD code - since you can pack twice as many floats as doubles into a SIMD register (e.g. SSE), operating on floats could potentially be faster. But since it's C#, that's likely not going to happen. – Calyth Jan 06 '09 at 18:58
  • Why is it not likely to happen in C#? You mean currently or ever? – Joan Venge Jan 06 '09 at 19:02
  • Unless Mono's SIMD extensions go into MS' .NET. Which is practically... never. – artificialidiot Jan 06 '09 at 20:24
  • 1
    I'm not sure, but some compilers may cast floats to doubles and back again, generating extra instructions. BTW, once I debugged an FP library. Just adding 2 IEEE floating-point numbers took ~300 instructions: it had to unpack to 80 bits, normalize, check for NaNs, and then undo all that, with the actual add happening somewhere in the middle. I'm sure FP processors have hardware help, but they still have to do all these things. – Mike Dunlavey May 04 '09 at 17:19
  • 14
    @P Daddy: I'd say the space advantage matters at every level of the cache hierarchy. When your first-level data cache is 16 KB and you are crunching an array of 4000 numbers, float could easily be faster. – Peter G. Feb 09 '11 at 12:35
  • 3
    This answer is not entirely true. 80-bit math is slower than 64-bit math. 64-bit code always uses the 64-bit SIMD (SSE) vectors to process math, which is why the results are faster. 32-bit math will execute in the same time as 64-bit. To prove that the 64-bit SSE math is faster than 80-bit FPU math, code the same math in something like Delphi, and the results speak for themselves. Further, on a 32-bit OS, 32-bit math is slower due to loads and stores and sometimes ops. Finally, 32-bit math can require half the memory shuffling (for large arrays), so it will naturally be faster in this case. – IamIC Sep 12 '13 at 06:39
  • 6
    @artificialidiot Never say never ;). SIMD is supported in .NET since 4.6 – ghord Oct 07 '15 at 09:58
  • I think you have to benchmark your own setup. On my own HP Workstation with Xeon processors, I found that float was around 35% faster than double for my application. But you have to be doing many hundreds of millions of calculations before you'll notice the difference, so only relevant for genuinely big data. – Tullochgorum Apr 28 '20 at 09:58
  • Also, the `Math` library only operates on double. – Erik Thysell Jan 16 '21 at 11:12
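
Following up on the SIMD comments above: since .NET Framework 4.6, `System.Numerics.Vector<T>` maps onto SSE/AVX registers, and a register holds twice as many floats as doubles. A minimal sketch of what that looks like (the class name is made up; on older frameworks this assumes the `System.Numerics.Vectors` package is referenced):

    using System;
    using System.Numerics;

    class SimdWidthSketch
    {
        static void Main()
        {
            // Typically prints 8 vs 4 on AVX2 hardware (4 vs 2 with plain SSE2).
            Console.WriteLine("Vector<float>  lanes: " + Vector<float>.Count);
            Console.WriteLine("Vector<double> lanes: " + Vector<double>.Count);

            // One vectorized pass over float arrays: each iteration processes
            // Vector<float>.Count elements at once.
            float[] a = new float[1024], b = new float[1024];
            var sum = Vector<float>.Zero;
            for (int i = 0; i <= a.Length - Vector<float>.Count; i += Vector<float>.Count)
            {
                sum += new Vector<float>(a, i) * new Vector<float>(b, i);
            }
            Console.WriteLine(Vector.Dot(sum, Vector<float>.One)); // horizontal sum
        }
    }
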
15

It depends on whether you target a 32-bit or a 64-bit system. If you compile for 64-bit, double will be faster. Compiling for 32-bit on a 64-bit machine and OS made float around 30% faster:

    using System;
    using System.Threading;

    class FloatDoubleTest
    {
        // Runs the same mix of Math calls and chained arithmetic on six double locals.
        public static void doubleTest(int loop)
        {
            Console.Write("double: ");
            for (int i = 0; i < loop; i++)
            {
                double a = 1000, b = 45, c = 12000, d = 2, e = 7, f = 1024;
                a = Math.Sin(a);
                b = Math.Asin(b);
                c = Math.Sqrt(c);
                d = d + d - d + d;
                e = e * e + e * e;
                f = f / f / f / f / f;
            }
        }

        // Same work on float locals; every Math result must be cast back to float,
        // because Math only takes and returns double.
        public static void floatTest(int loop)
        {
            Console.Write("float: ");
            for (int i = 0; i < loop; i++)
            {
                float a = 1000, b = 45, c = 12000, d = 2, e = 7, f = 1024;
                a = (float) Math.Sin(a);
                b = (float) Math.Asin(b);
                c = (float) Math.Sqrt(c);
                d = d + d - d + d;
                e = e * e + e * e;
                f = f / f / f / f / f;
            }
        }

        static void Main(string[] args)
        {
            DateTime time = DateTime.Now;
            doubleTest(5 * 1000000);
            Console.WriteLine("milliseconds: " + (DateTime.Now - time).TotalMilliseconds);

            time = DateTime.Now;
            floatTest(5 * 1000000);
            Console.WriteLine("milliseconds: " + (DateTime.Now - time).TotalMilliseconds);

            Thread.Sleep(5000);
        }
    }
Bitterblue
  • 6
    Have you considered that those 30% could be because of the extra casts you use? – Rasmus Damgaard Nielsen Oct 19 '14 at 10:16
  • @RasmusDamgaardNielsen The casts are part of the problem, since `Math` works with double. But you misread my post: my tests showed float performing better. – Bitterblue Oct 20 '14 at 06:45
  • 5
    The results posted above are bogus. My tests show that on an older 32-bit machine with .NET 4.0 in Release mode, the `float` and `double` performance are virtually identical. Less than 0.3% difference when averaged over many independent trials, where each trial exercised multiply, divide, and addition ops on consecutively chained variables (to avoid any compiler optimizations getting in the way). I tried a second set of tests with `Math.Sin()` and `Math.Sqrt()` and also got identical results. – Special Sauce Dec 06 '15 at 02:09
12

I had a small project where I used CUDA, and I can remember that float was faster than double there, too. For one, the traffic between host and device is lower (the host is the CPU and its "normal" RAM; the device is the GPU and its corresponding RAM). But even when the data resides on the device the whole time, double is slower. I think I read somewhere that this has changed recently or is supposed to change with the next generation, but I'm not sure.

So it seems that the GPU simply can't handle double precision natively in those cases, which would also explain why GLFloat is usually used rather than GLDouble.

(As I said, this is only as far as I can remember; I just stumbled upon this question while searching for float vs. double on a CPU.)

Mene
  • 7
    GPUs are totally different animals from FPUs. As others mentioned, the FPU's native format is 80-bit extended precision, and has been for a long time now. GPUs, however, approach this field from single precision. It's **well known** that their DP FP (double-precision floating-point) performance is often exactly half of the SP FP performance. It seems that they often have SP floating-point units and have to reuse the unit to cover double precision, which yields exactly two cycles compared to one. That's a **huge performance difference**, which stunned me when I was faced with it. – Csaba Toth Aug 08 '13 at 18:26
  • 1
    Some scientific computations require DP FP, and the leading GPU manufacturers didn't advertise the performance penalty around that. Now they (AMD, nVidia) seem to be improving somewhat on that DP vs. SP topic. Intel Xeon Phi's many cores contain Pentium-derived FPUs, and notice that Intel emphasized its **double-precision** capabilities. That's where it may really be able to compete with the GPGPU monsters. – Csaba Toth Aug 08 '13 at 18:30
11

There are still some cases where floats are preferred, however. With OpenGL coding, for example, it's far more common to use the GLFloat datatype (generally mapped directly to a 32-bit float), as it is more efficient on most GPUs than GLDouble.

Cruachan
  • 3
    Maybe due to higher data throughput? If you have a matrix of numbers (z-buffer etc.), the data size becomes more important, and avoiding conversions between float and double speeds up handling. My guess. – Lucero Apr 16 '09 at 19:34
  • 2
    Undoubtedly throughput. Also, given the specialised context, there is unlikely to be anything visible gained from using doubles over floats, so why waste the memory - especially as it is in shorter supply on GPUs than on CPUs. – Cruachan Apr 16 '09 at 20:59
  • 1
    Throughput **and** also the fact that SP FP (single-precision floating point) is more the native format of the GPU's internal FPUs than DP FP (double precision). See my comment on @Mene's answer. GPUs and CPU FPUs are very different animals; the CPU's FPU thinks in DP FP. – Csaba Toth Aug 08 '13 at 18:34
  • [float vs double on graphics hardware](https://stackoverflow.com/questions/2079906/float-vs-double-on-graphics-hardware) – zwcloud Sep 11 '17 at 02:58