Convert double to float goes wrong - C#

Question

I need to calculate bounds for a RectangleF type.

For some reason, the cast from double to float is not as precise as I expect it to be.

Here is an example of such a calculation:


float MinX = 0f, MaxX = 0f;
float MinY = 0f, MaxY = 0f;
float BoundsWidth = 0.2f;
float BoundsHeight = 0.1f;
double BoundsY = 2333638.6551984739;
double BoundsX = 895.0999755859375;

MinX = (float)BoundsX;
MinY = (float)BoundsY;

var MaxX_Defect = BoundsX + BoundsWidth;
var MaxY_Defect = BoundsY + BoundsHeight;
MaxX = (float)(MaxX_Defect);
MaxY = (float)(MaxY_Defect);

When I try to calculate the height, MaxY - MinY, it evaluates to 0 instead of 0.1f.

How can I fix this?

`float` does not have enough precision. You can try this: `float a = 2333638; float b = a + 0.1f; Console.WriteLine(b - a);`, it will output 0. — Lasse V. Karlsen, Jan 11 '22 at 09:24
By converting a `double` to a `float` you're downgrading precision — ChrisBD, Jan 11 '22 at 09:24
[floating point numbers have limited precision](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/floating-point-numeric-types). simple as that. — Franz Gleichmann, Jan 11 '22 at 09:25
You can use `double` for all those calculations to get more precision, but `double` is also limited in precision so if you have large enough numbers in a `double` you will get the same problem there. Specifically, if you do my `b-a` example in my other comment but switch to `double` you will need to use a number like `2333638000000000` to have the same issue. — Lasse V. Karlsen, Jan 11 '22 at 09:27
From what I read in the documentation, I didn't go over the limit of float, and the precision is supposed to be up to 9 digits, so I'm a little confused @Franz Gleichmann — tamir1020, Jan 11 '22 at 09:37
I don't mind using double for this calculation, but as I wrote, I need the result for a `RectangleF` type @Lasse V.Karlsen — tamir1020, Jan 11 '22 at 09:38
I understand that, but `float` will still have limited precision, regardless of what you want. So you need to either accept the problem or deal with it. — Lasse V. Karlsen, Jan 11 '22 at 09:46
@tamir1020 the point is: do not _ever_ expect floating point numbers to be _precise_ at all. they're just approximations. if you need arbitrary, decimal precision, use `Decimal` instead. (also: i suggest stepping through your code with the debugger and looking at your values. `MinY` gets rounded up to `2333638.75` - because of the limited precision. which is _precisely_ the root of your problem.) — Franz Gleichmann, Jan 11 '22 at 09:49
Re “From what I read at the documentation”: [Unfortunately, Microsoft’s documentation about `float` precision is nonsensical and wrong.](https://stackoverflow.com/a/61614323/298225) The `float` type does not guarantee you 9 digits. — Eric Postpischil, Jan 11 '22 at 11:33
@EricPostpischil, your "Unfortunately ..." answer is perhaps one of the best I've read about float accuracy and resolution. By the way, shouldn't this question be closed on the grounds it's a duplicate of [Is floating point math broken?](https://stackoverflow.com/questions/588004/is-floating-point-math-broken)? — Arc, Jan 12 '22 at 23:09
Does this answer your question? [Is floating point math broken?](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) — Arc, Jan 12 '22 at 23:10
@Arc: When people have specific issues, I avoid marking questions as duplicates of that one because its overall message is to avoid floating-point, or, if you use it, to regard it as mysterious or unreliable. A better course is to educate people. Ultimately, we ought to have a variety of questions that illuminate floating-point arithmetic, but it is difficult to build that from random questions. — Eric Postpischil, Jan 13 '22 at 00:40
@EricPostpischil, I understand. I don't see it as diminishing floats quite the way you do, but I don't think its contents are very enlightening either. As of today, it is the big bucket where about a third of the `precision` tagged questions on float arithmetic go to die. But you are right, tailored answers with specific numerics like the one you provided here - and in many other cases - have far more value in terms of creating knowledge than just closing someone's issue and pointing to work already done. — Arc, Jan 13 '22 at 04:59
Your answer is nearly perfect, I would just suggest you link some references to it, like "What every computer scientist should know..." or perhaps Knuth's chapter 4 of TAOCP, so people are encouraged to seek and read more about floats. — Arc, Jan 13 '22 at 05:03

Eric Postpischil · Answer 1 · 2022-01-13T18:40:51.833

The line `float BoundsHeight = 0.1f;` converts 0.1 to the nearest value representable in `float`, resulting in `BoundsHeight` being 0.100000001490116119384765625.

The line `double BoundsY = 2333638.6551984739;` similarly converts to `double`, setting `BoundsY` to 2,333,638.6551984739489853382110595703125.

The line `MinY = (float)BoundsY;` converts that to `float`, setting `MinY` to 2,333,638.75.

The line `var MaxY_Defect = BoundsY + BoundsHeight;` computes using `double` (I presume; I am not familiar with C# semantics), setting `MaxY_Defect` to 2,333,638.7551984754391014575958251953125.

The line `MaxY = (float)(MaxY_Defect);` converts that to `float`, setting `MaxY` to 2,333,638.75.

Then we can see that MinY and MaxY have the same value, so of course MaxY-MinY is zero.
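The steps above can be reproduced with a minimal sketch. It assumes C# 9 or later so that top-level statements compile, and the lowercase variable names are illustrative copies of the ones in the question.

```csharp
using System;

// Values copied from the question.
float boundsHeight = 0.1f;
double boundsY = 2333638.6551984739;

// Converting to float rounds to the nearest representable float.
float minY = (float)boundsY;

// The float operand is widened to double, so the sum is computed in double.
double maxYDefect = boundsY + boundsHeight;

// Converting back to float rounds to the same float as minY.
float maxY = (float)maxYDefect;

Console.WriteLine(minY == maxY);   // True
Console.WriteLine(maxY - minY);    // 0
```

Casting `minY` to `double` before printing or inspecting it in a debugger shows the stored value, 2333638.75, without any further rounding.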

Quite simply, float does not have enough precision to distinguish between 2,333,638.6551984739489853382110595703125 and 2,333,638.7551984754391014575958251953125. At the scale of 2,333,638, the distance between adjacent representable numbers in the float format is .25. This is because the format has 24 bits for the significand (the fraction portion of the floating-point representation). 2,333,638 is between 2²¹ and 2²², so the exponent in its floating-point representation scales the significand to have bits representing values from 2²¹ to 2⁻² (from 21 to −2, inclusive, is 24 positions). So changing the significand by 1 in its lowest bit changes the represented number by 2⁻² = .25.

Thus, when 2,333,638.655… and 2,333,638.755… are converted to float, they have the same result, 2,333,638.75.
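To check the spacing between adjacent `float` values at a given magnitude, one option is to ask for the next representable value and subtract. The sketch below assumes .NET Core 3.0 or later, where `MathF.BitIncrement` is available, and C# 9 top-level statements. The variable names are illustrative.

```csharp
using System;

// Coordinates from the question, rounded to float.
float y = (float)2333638.6551984739;   // lies between 2^21 and 2^22
float x = (float)895.0999755859375;    // lies between 2^9 and 2^10

// MathF.BitIncrement returns the next representable float above its argument.
// The subtraction is done in double so the difference itself is not rounded.
double spacingY = (double)MathF.BitIncrement(y) - (double)y;
double spacingX = (double)MathF.BitIncrement(x) - (double)x;

Console.WriteLine(spacingY);   // spacing at y is 2^-2 = 0.25
Console.WriteLine(spacingX);   // spacing at x is 2^-14, roughly 0.000061
```

The X coordinate in the question is small enough that a width of 0.2 survives the conversion, while the Y coordinate is not.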

You cannot use float to distinguish between coordinates or sizes that are this close at that magnitude. You can use double or you might be able to translate the coordinates to be nearer the origin (so their magnitudes are smaller, putting them in a region where the float resolution is finer).

As long as the final result is small, you could do the intermediate calculations using double but still represent the final result well using float.
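Both suggestions can be sketched as follows. The local origin (`originX`, `originY`) is a hypothetical choice made for illustration and is not part of the question. Any nearby reference point would do, as long as it is kept in `double` and added back in `double` when absolute coordinates are needed. The sketch assumes C# 9 top-level statements.

```csharp
using System;

// Values copied from the question.
double boundsX = 895.0999755859375;
double boundsY = 2333638.6551984739;
float boundsWidth = 0.2f;
float boundsHeight = 0.1f;

// Option 1: do the arithmetic in double and convert only the small result.
double maxYInDouble = boundsY + boundsHeight;
float height = (float)(maxYInDouble - boundsY);

// Option 2: translate the coordinates toward a local origin (hypothetical
// choice: the integer part of each coordinate) before converting to float.
double originX = Math.Floor(boundsX);
double originY = Math.Floor(boundsY);

float minX = (float)(boundsX - originX);
float minY = (float)(boundsY - originY);
float maxX = (float)(boundsX + boundsWidth - originX);
float maxY = (float)(boundsY + boundsHeight - originY);

Console.WriteLine(height);        // the height computed in double, then narrowed
Console.WriteLine(maxX - minX);   // close to 0.2
Console.WriteLine(maxY - minY);   // close to 0.1 instead of 0
```

The translated values are all below 1, where the `float` spacing is far finer than 0.25, so they could be stored in a float-based rectangle without collapsing the height to zero.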

For additional information on floating-point arithmetic, I recommend Handbook of Floating-Point Arithmetic by Muller, Brunie, de Dinechin, Jeannerod, Joldes, Lefèvre, Melquiond, Revol, and Torres. A PDF of a prior edition appears to be available online, and the official IEEE 754-2019 Standard for Floating-Point Arithmetic is published by the IEEE.
