15

Mathematically, consider for this question the rational number

8725724278030350 / 2**48

where ** in the denominator denotes exponentiation, i.e. the denominator is 2 to the 48th power. (The fraction is not in lowest terms, reducible by 2.) This number is exactly representable as a System.Double. Its decimal expansion is

31.0000000000000'49'73799150320701301097869873046875 (exact)

where the apostrophes do not represent missing digits but merely mark the boundaries where rounding to 15 and 17 digits, respectively, is to be performed.
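The claims about the fraction and its expansion can be checked outside .NET. The following is a minimal sketch in Python (not C#), chosen because its `fractions` and `decimal` modules convert a double without any rounding; the variable names are illustrative only.

```python
from decimal import Decimal
from fractions import Fraction

# The literal used in the C# sample; it parses to the nearest double.
evil = 31.0000000000000497

# A double is a dyadic rational, so Fraction recovers it exactly
# (and reduces it to lowest terms).
as_fraction = Fraction(evil)
same_value = as_fraction == Fraction(8725724278030350, 2**48)

# Decimal(float) is an exact conversion, so this is the full expansion.
exact_digits = str(Decimal(evil))

print(as_fraction)
print(same_value)
print(exact_digits)
```

In lowest terms the denominator comes out as 2**47, which matches the remark that the fraction is reducible by 2.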

Note the following: If this number is rounded to 15 digits, the result will be 31 (followed by thirteen 0s) because the next digits (49...) begin with a 4 (meaning round down). But if the number is first rounded to 17 digits and then rounded to 15 digits, the result could be 31.0000000000001. This is because the first rounding rounds up, turning the digits 49 into 50 (the following digits were 73...), so the 17-digit result ends exactly in 50. The second rounding might then round up again (when the midpoint-rounding rule says "round away from zero").
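The two rounding paths can be reproduced with exact decimal arithmetic. This is a sketch in Python rather than C#, and it assumes the second rounding uses "round half away from zero" (`ROUND_HALF_UP` in the `decimal` module); `round_sig` is a hypothetical helper, not a .NET or library function.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_sig(value: Decimal, digits: int) -> Decimal:
    """Round to `digits` significant digits, ties away from zero."""
    # adjusted() is the exponent of the leading digit (1 for 31.0...).
    quantum = Decimal(1).scaleb(value.adjusted() - digits + 1)
    return value.quantize(quantum, rounding=ROUND_HALF_UP)

exact = Decimal(31.0000000000000497)   # exact value of the double

once = round_sig(exact, 15)            # single rounding to 15 digits
via_17 = round_sig(exact, 17)          # first step of the double rounding
twice = round_sig(via_17, 15)          # second step, now an exact tie

print(once)    # 31.0000000000000
print(via_17)  # 31.000000000000050
print(twice)   # 31.0000000000001
```

Rounding once gives 31, while rounding via 17 digits lands on an exact tie that the away-from-zero rule then pushes up.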

(There are many more numbers with the above characteristics, of course.)

Now, it turns out that .NET's standard string representation of this number is "31.0000000000001". The question: Isn't this a bug? By standard string representation we mean the String produced by the parameterless Double.ToString() instance method, which is of course identical to what is produced by ToString("G").

An interesting thing to note is that if you cast the above number to System.Decimal then you get a decimal that is 31 exactly! See this Stack Overflow question for a discussion of the surprising fact that casting a Double to Decimal involves first rounding to 15 digits. This means that casting to Decimal makes a correct round to 15 digits, whereas calling ToString() makes an incorrect one.

To sum up, we have a floating-point number that, when output to the user, is 31.0000000000001, but when converted to Decimal (where 29 digits are available), becomes 31 exactly. This is unfortunate.

Here's some C# code for you to verify the problem:

// Needs: using System; using System.Globalization; using System.Linq;
static void Main()
{
  const double evil = 31.0000000000000497;
  string exactString = DoubleConverter.ToExactString(evil); // Jon Skeet, http://csharpindepth.com/Articles/General/FloatingPoint.aspx 

  Console.WriteLine("Exact value (Jon Skeet): {0}", exactString);   // writes 31.00000000000004973799150320701301097869873046875
  Console.WriteLine("General format (G): {0}", evil);               // writes 31.0000000000001
  Console.WriteLine("Round-trip format (R): {0:R}", evil);          // writes 31.00000000000005

  Console.WriteLine();
  Console.WriteLine("Binary repr.: {0}", String.Join(", ", BitConverter.GetBytes(evil).Select(b => "0x" + b.ToString("X2"))));

  Console.WriteLine();
  decimal converted = (decimal)evil;
  Console.WriteLine("Decimal version: {0}", converted);             // writes 31
  decimal preciseDecimal = decimal.Parse(exactString, CultureInfo.InvariantCulture);
  Console.WriteLine("Better decimal: {0}", preciseDecimal);         // writes 31.000000000000049737991503207
}

The above code uses Skeet's ToExactString method. If you don't want to use his stuff (can be found through the URL), just delete the code lines above dependent on exactString. You can still see how the Double in question (evil) is rounded and cast.

ADDITION:

OK, so I tested some more numbers, and here's a table:

  exact value (truncated)       "R" format         "G" format     decimal cast
 -------------------------  ------------------  ----------------  ------------
 6.00000000000000'53'29...  6.0000000000000053  6.00000000000001  6
 9.00000000000000'53'29...  9.0000000000000053  9.00000000000001  9
 30.0000000000000'49'73...  30.00000000000005   30.0000000000001  30
 50.0000000000000'49'73...  50.00000000000005   50.0000000000001  50
 200.000000000000'51'15...  200.00000000000051  200.000000000001  200
 500.000000000000'51'15...  500.00000000000051  500.000000000001  500
 1020.00000000000'50'02...  1020.000000000005   1020.00000000001  1020
 2000.00000000000'50'02...  2000.000000000005   2000.00000000001  2000
 3000.00000000000'50'02...  3000.000000000005   3000.00000000001  3000
 9000.00000000000'54'56...  9000.0000000000055  9000.00000000001  9000
 20000.0000000000'50'93...  20000.000000000051  20000.0000000001  20000
 50000.0000000000'50'93...  50000.000000000051  50000.0000000001  50000
 500000.000000000'52'38...  500000.00000000052  500000.000000001  500000
 1020000.00000000'50'05...  1020000.000000005   1020000.00000001  1020000

The first column gives the exact (though truncated) value that the Double represents. The second column gives the string representation from the "R" format string. The third column gives the usual string representation. And finally the fourth column gives the System.Decimal that results from converting this Double.

We conclude the following:

  • Round to 15 digits by ToString() and round to 15 digits by conversion to Decimal disagree in very many cases
  • Conversion to Decimal also rounds incorrectly in many cases, and the errors in these cases cannot be described as "round-twice" errors
  • In my cases, ToString() seems to yield a bigger number than Decimal conversion when they disagree (no matter which of the two rounds correctly)

I only experimented with cases like the above. I haven't checked if there are rounding errors with numbers of other "forms".
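To see which of the two columns holds the correct 15-digit rounding for a given row, one can ask a formatter that rounds correctly. This is a sketch in Python (not .NET), assuming its float formatting rounds the exact binary value once, as stated elsewhere in this thread; the three "R" strings are copied from the table.

```python
# "R" strings copied from three rows of the table; each parses back to the
# double that its row describes.
round_trip = ["6.0000000000000053", "30.00000000000005", "1020.000000000005"]

# '.15g' rounds the exact value once to 15 significant digits and drops
# trailing zeros.
correct_15 = [format(float(s), ".15g") for s in round_trip]

for r, c in zip(round_trip, correct_15):
    print(f"{r:>20} -> {c}")
```

For these three rows the reference agrees with the "G" column for 6 and 1020 and with the decimal cast for 30, consistent with the first two conclusions above.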

Jeppe Stig Nielsen
  • OK, that's all nice and detailed, but...what is your question? – flq Jun 18 '12 at 14:36
  • 3
    It's in bold in the middle... – David M Jun 18 '12 at 14:42
  • You asked if it's a bug; it's not a bug, they are not exact values. – Security Hound Jun 18 '12 at 14:50
  • 4
    @Ramhound If you read the question carefully, you will see that I am completely aware of the precision. I know that not every number is exactly representable. My question is: Is this not a bug in the `ToString()` method? We can all agree that if `ToString()` returned `"-42.8"` on this number, it would be a bug, even if `Double`s are "not exact values" (your words). So `ToString()` might have a bug, even if the precision of a floating-point number is not unlimited. – Jeppe Stig Nielsen Jun 18 '12 at 14:56
  • Just incorporated this link into my answer, but look at the very bottom of this page and it describes the exact same behavior: http://msdn.microsoft.com/en-us/library/kfsatb94.aspx (it even calls it double rounding). – nicholas Jun 18 '12 at 18:45
  • +1, added to favorites, paging @EricLippert.... – Matthew Jun 18 '12 at 22:53

5 Answers

9

So from your experiments, it appears that Double.ToString doesn't do correct rounding.

That's rather unfortunate, but not particularly surprising: doing correct rounding for binary to decimal conversions is nontrivial, and also potentially quite slow, requiring multiprecision arithmetic in corner cases. See David Gay's dtoa.c code here for one example of what's involved in correctly-rounded double-to-string and string-to-double conversion. (Python currently uses a variant of this code for its float-to-string and string-to-float conversions.)

Even the current IEEE 754 standard for floating-point arithmetic recommends, but doesn't require, that conversions from binary floating-point types to decimal strings are always correctly rounded. Here's a snippet from section 5.12.2, "External decimal character sequences representing finite numbers":

There might be an implementation-defined limit on the number of significant digits that can be converted with correct rounding to and from supported binary formats. That limit, H, shall be such that H ≥ M+3 and it should be that H is unbounded.

Here M is defined as the maximum of Pmin(bf) over all supported binary formats bf, and since Pmin(float64) is defined as 17 and .NET supports the float64 format via the Double type, M should be at least 17 on .NET. In short, this means that if .NET were to follow the standard, it would be providing correctly rounded string conversions up to at least 20 significant digits. So it looks as though the .NET Double doesn't meet this standard.
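With M = 17, the bound H ≥ M + 3 works out to at least 20 correctly rounded significant digits. As an illustration of what that looks like for the number in the question, here is a sketch in Python (not .NET); the cross-check through the `decimal` module is added for illustration and is not part of the standard.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

evil = 31.0000000000000497

# 'g' formatting with an explicit precision rounds the exact value once
# and drops trailing zeros.
at_15 = format(evil, ".15g")
at_17 = format(evil, ".17g")
at_20 = format(evil, ".20g")

# Independent check: round the exact expansion to 20 significant digits.
check_20 = Context(prec=20, rounding=ROUND_HALF_EVEN).plus(Decimal(evil))

print(at_15)     # 31
print(at_17)     # 31.00000000000005
print(at_20)     # 31.000000000000049738
print(check_20)  # 31.000000000000049738
```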

In answer to the 'Is this a bug' question, much as I'd like it to be a bug, there really doesn't seem to be any claim of accuracy or IEEE 754 conformance anywhere that I can find in the number formatting documentation for .NET. So it might be considered undesirable, but I'd have a hard time calling it an actual bug.


EDIT: Jeppe Stig Nielsen points out that the System.Double page on MSDN states that

Double complies with the IEC 60559:1989 (IEEE 754) standard for binary floating-point arithmetic.

It's not clear to me exactly what this statement of compliance is supposed to cover, but even for the older 1985 version of IEEE 754, the string conversion described seems to violate the binary-to-decimal requirements of that standard.

Given that, I'll happily upgrade my assessment to 'possible bug'.

Mark Dickinson
  • +1 for a good exploratory answer... to the point though you note that it isn't a bug... I agree but to the extent that the MSDN documentation is misleading on the behavior (wouldn't be the first time).... I think this reinforces what we already know, though: that if you want significand precision of 15 digits (per the OP) retained during math operations then neither `double` nor `float` are appropriate types. – Matthew Jun 18 '12 at 17:43
  • Very informative. On the MSDN site, they mention IEEE on the [System.Double page](http://msdn.microsoft.com/en-us/library/system.double.aspx) ("Double complies with the IEC 60559:1989 (IEEE 754) standard for binary floating-point arithmetic.") and on the [C# double page](http://msdn.microsoft.com/en-us/library/678hzkk9.aspx), so one could think that it was their intention to comply with IEEE with the `ToString` as well. – Jeppe Stig Nielsen Jun 18 '12 at 17:49
  • Ah, nice find. That's the older version of IEEE 754 (from 1985) they're claiming to comply with. I don't have a copy of that, but I do recall that the requirements for float-to-string conversions were somewhat weaker there, so it's conceivable that their version of ToString complies with that. Sorry for not being able to offer more here. – Mark Dickinson Jun 18 '12 at 18:22
  • 4
    @Jeppe Stig Nielsen: Just found a copy of the 1985 version; even there, if I'm reading it correctly, they specify a range within which binary to decimal (i.e., string) conversions should be correctly rounded, and your example falls within that range. – Mark Dickinson Jun 18 '12 at 18:28
  • I'll accept this answer. Note that I just edited my question to supply more examples. Conversion to `Decimal` rounds incorrectly too! – Jeppe Stig Nielsen Jun 26 '12 at 16:07
6

First take a look at the bottom of this page which shows a very similar 'double rounding' problem.

Checking the binary / hex representation of the following floating-point numbers shows that the given range is stored as the same number in double format:

31.0000000000000480 = 0x403f00000000000e
31.0000000000000497 = 0x403f00000000000e
31.0000000000000515 = 0x403f00000000000e

As noted by several others, that is because the closest representable double has an exact value of 31.00000000000004973799150320701301097869873046875.
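The shared bit pattern is easy to confirm. This is a sketch in Python (not C#) using the standard `struct` module; `bits` is a hypothetical helper name.

```python
import struct

def bits(x: float) -> str:
    """Big-endian IEEE 754 binary64 encoding of x as a hex string."""
    return "0x" + struct.pack(">d", x).hex()

literals = ["31.0000000000000480", "31.0000000000000497", "31.0000000000000515"]
patterns = [bits(float(s)) for s in literals]

for text, pattern in zip(literals, patterns):
    print(text, "=", pattern)
```

All three literals parse to the same double, whose trailing significand bits `...0e` encode 14 units of 2**-48 above 31.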

There are an additional two aspects to consider in the forward and reverse conversion of IEEE 754 to strings, especially in the .NET environment.

First (I cannot find a primary source) from Wikipedia we have:

If a decimal string with at most 15 significant decimal is converted to IEEE 754 double precision and then converted back to the same number of significant decimal, then the final string should match the original; and if an IEEE 754 double precision is converted to a decimal string with at least 17 significant decimal and then converted back to double, then the final number must match the original.

Therefore, as far as the standard is concerned, the string 31.0000000000000497 is not guaranteed to survive a conversion to double and back to string: it has 18 significant digits, more than the 15 that the guarantee covers.
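Both halves of the quoted guarantee can be tried on values from this thread. This is a sketch in Python (not .NET), assuming its conversions are correctly rounded in both directions.

```python
evil = 31.0000000000000497

# double -> 17 significant digits -> double recovers the original value.
seventeen = format(evil, ".17g")
survives_17 = float(seventeen) == evil

# 15 digits are too few to identify this particular double.
fifteen = format(evil, ".15g")
survives_15 = float(fifteen) == evil

# string with 15 significant digits -> double -> 15 digits gives it back.
text = "31.0000000000001"
text_back = format(float(text), ".15g")

print(seventeen, survives_17)  # 31.00000000000005 True
print(fifteen, survives_15)    # 31 False
print(text_back == text)       # True
```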

The second consideration is that unless the double-to-string conversion has 17 significant digits, its rounding behavior is not explicitly defined in the standard either.

Furthermore, the documentation on Double.ToString() shows that it is governed by the numeric format specifier of the current culture settings.

Possible Complete Explanation:

I suspect the double rounding occurs something like this: the initial decimal string is created with 16 or 17 significant digits, because that is the precision required for "round trip" conversion, giving an intermediate result of 31.00000000000005 or 31.000000000000050. Then, due to default culture settings, the result is rounded to 15 significant digits, 31.0000000000001, because 15 significant decimal digits is the minimum precision for all doubles.

Doing an intermediate conversion to Decimal on the other hand, avoids this problem in a different way: it truncates to 15 significant digits directly.

nicholas
  • 2
    OK, so the number internally given by `0x403f00000000000e` (your notation) can be said to represent a whole interval (of length `2 ** -48`, you will agree). Your examples prove that this interval contains both numbers belonging to `"31"` and numbers belonging to `"31.0000000000001"`. So **in your opinion** it should be "undefined behavior" which of those two strings `ToString()` chooses? In **my opinion** it should use the midpoint of that interval, and the midpoint belongs to `"31"`. – Jeppe Stig Nielsen Jun 18 '12 at 15:36
  • Well, mathematically and intuitively the interval midpoint would be a logical choice; however, it would likely be more computationally expensive. Determining the upper and lower bounds in decimal representation would require twice as many conversions. From a mathematical perspective, there is no reason to choose ceiling vs floor vs midpoint. In this case midpoint makes the most sense because you know the intended decimal representation a priori, but there are an equal number of cases where rounding up would provide the "correct" answer. – nicholas Jun 18 '12 at 16:24
  • 3
    @nicholas: The overwhelmingly dominant IEEE 754 standard for floating-point is very much based on the model that individual floating-point numbers represent exact values rather than ranges: the 'correct' result of an operation is the closest floating-point number to the mathematical operation computed with those exact input values. In this case, the *exact* value is the one that the OP gave, and with correct rounding, rounding to 15 digits should give `31.0`, not `31.0...01`. Unfortunately, it seems that `ToString` fails to do correct rounding here. – Mark Dickinson Jun 18 '12 at 16:42
  • (deleted two non-constructive comments) – Mark Dickinson Jun 18 '12 at 18:17
  • To your new link in the beginning of your answer: I agree this is really related. Good find! Note this is "community content", not documentation from Microsoft themselves. Specifically, saying `evil.ToString("F13")` also has the same problem (gives `"31.0000000000001"`). On the other hand, saying `Math.Round(evil, 13)` correctly returns `31` exactly. – Jeppe Stig Nielsen Jun 18 '12 at 21:14
  • To your comment _there are an equal number of cases where rounding up would provide the "correct" answer_: Hmm, but this is not consistent. For example, for the number whose "interval" in this sense spans from `"800"` to `"800.000000000001"`, the `ToString()` representation is (correctly) `"800"`, but this does not correspond to the upper end point of the "interval" belonging to this number. – Jeppe Stig Nielsen Jun 18 '12 at 21:18
  • But, alas, `Math.Round` uses another midpoint rounding than the `ToString` overloads, and indeed, calling `Math.Round(evil, 13, MidpointRounding.AwayFromZero)` also has the problem, so this needs a little more "research". – Jeppe Stig Nielsen Jun 18 '12 at 22:06
  • 1
    Floating point numbers are not intervals. They are numbers. "The string representation of 31.0000000000001" is **not** "sometimes correct", because 0x403f00000000000e is always precisely equal to 31.00000000000004973799150320701301097869873046875, and never anything else. – Stephen Canon Jun 19 '12 at 13:42
  • Revised after reconsideration following comments from others. – nicholas Jun 19 '12 at 16:47
  • @StephenCanon: For some purposes it is useful to have a language and floating-point spec which would guarantee that running the same application on multiple platforms will yield bit-identical results. For such applications, a result which is 0.50000000001LSB from the arithmetically perfect result is simply wrong. Many other applications will be happier with a quickly-computed result that's within a few parts per million of being correct, than with a slowly-computed result that's bit-perfect. It's too bad languages and frameworks don't let applications... – supercat Jun 23 '14 at 14:30
  • @StephenCanon: ...specify whether they need precisely-defined semantics or would prefer faster execution. IMHO, floating-point method specifications should say what degree of accuracy they'll provide [if they guarantee absolutely-precise rounding, the specs should expressly *say* so]. If a method doesn't specify a precise degree of accuracy, my "default" expectation would be that the result should be within +/- 1/2 lsb of what the value would be for some combination of values operands +/- 1/2 lsb of the values passed in. The cost of implementing precision beyond that... – supercat Jun 23 '14 at 14:35
  • ...often goes up considerably [compare the time to perform a precisely-rounded division between two `floats` with the time to cast `float` to `double`, multiply by a pre-computed `double` reciprocal, and cast back], and for *most* applications such extra precision would offer insufficient benefit to justify the extra time required. – supercat Jun 23 '14 at 14:37
  • @nicholas: No, Convert.ToDecimal(double) does not *truncate*, rather it *rounds to nearest*. That being said: truncation is the correct way to limit the precision of a number that will later be rounded, precisely to avoid the double rounding issue. – Jan Heldal Nov 12 '20 at 10:34
1

The question: Isn't this a bug?

Yes. See this PR on GitHub. The reason for rounding twice is, AFAIK, to get "pretty" formatted output, but it introduces a bug, as you have already discovered here. We tried to fix it by removing the 15-digit precision conversion and going directly to the 17-digit precision conversion. The bad news is that this is a breaking change and will break a lot of things. For example, one of the test cases will break:

10:12:26 Assert.Equal() Failure
10:12:26 Expected: 1.1
10:12:26 Actual: 1.1000000000000001

The fix would impact a large set of existing libraries, so this PR has been closed for now. However, the .NET Core team is still looking for a chance to fix this bug. You are welcome to join the discussion.

Jim Ma
0

Truncation is the correct way to limit the precision of a number that will later be rounded, precisely to avoid the double rounding issue.

Jan Heldal
-1

I have a simpler suspicion: The culprit is likely the pow operator (**). While your number is exactly representable as a double, for convenience (a power operator takes a lot of work to get right) the power is calculated via the exponential function. This is one reason you can improve performance by multiplying a number repeatedly instead of using pow(): pow() is very expensive.

So it does not give you the correct 2^48, but something slightly incorrect and therefore you have your rounding problems. Please check out what 2^48 exactly returns.

EDIT: Sorry, I only skimmed the problem and gave a wrong suspicion. There is a known issue with double rounding on Intel processors. Older code uses the internal 80-bit format of the FPU instead of the SSE instructions, which is likely to cause the error. The value is written exactly to the 80-bit register and then rounded twice, so Jeppe has already found and neatly explained the problem.

Is it a bug ? Well, the processor is doing everything right, it is simply the problem that the Intel FPU internally has more precision for floating-point operations.

FURTHER EDIT AND INFORMATION: The "double rounding" is a known issue and explicitly mentioned in "Handbook of Floating-Point Arithmetic" by Jean-Michel Muller et al. in the chapter "The Need for a Revision" under "3.3.1 A typical problem: 'double rounding'" on page 75:

The processor being used may offer an internal precision that is wider than the precision of the variables of the program (a typical example is the double-extended format available on Intel Platforms, when the variables of the program are single-precision or double-precision floating-point numbers). This may sometimes have strange side effects, as we will see in this section. Consider the C program [...]

#include <stdio.h>

int main(void) 
{
  double a = 1848874847.0;
  double b = 19954562207.0;
  double c;
  c = a * b;
  printf("c = %20.19e\n", c);
  return 0;
}

32bit: GCC 4.1.2 20061115 on Linux/Debian

Without a compiler switch, or with -mfpmath=387 (80-bit FPU): 3.6893488147419103232e+19
With -march=pentium4 -mfpmath=sse (SSE), or on 64-bit: 3.6893488147419111424e+19

As explained in the book, the discrepancy is caused by double rounding: first to the 80-bit extended format and then to the 53-bit significand of a double.
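The two printed results can be reproduced with exact integer arithmetic. This sketch is in Python rather than C, so that the wide intermediate can be modelled explicitly. It assumes round-to-nearest, ties-to-even, a 64-bit significand for the extended format and a 53-bit significand for double; `round_half_even` is a hypothetical helper.

```python
def round_half_even(n: int, step: int) -> int:
    """Round n to the nearest multiple of step; ties go to the even multiple."""
    q, r = divmod(n, step)
    if 2 * r > step or (2 * r == step and q % 2 == 1):
        q += 1
    return q * step

a, b = 1848874847, 19954562207
exact = a * b                      # Python integers are exact

# The product lies in [2**65, 2**66): representable values are spaced
# 2**13 apart with 53 significand bits and 2**2 apart with 64 bits.
direct = round_half_even(exact, 2**13)       # one rounding (SSE / 64-bit)
extended = round_half_even(exact, 2**2)      # first rounding (x87 register)
twice = round_half_even(extended, 2**13)     # second rounding (store to double)

print(exact - 2**65)  # 4097
print(direct)         # 36893488147419111424
print(twice)          # 36893488147419103232
```

The exact product sits just above the midpoint between two doubles. The first rounding moves it onto the midpoint, and the tie rule then sends it down instead of up.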

Thorsten S.
  • This cannot be the case because he uses the same resolved `double` for each representation. Review the sample code provided. – Matthew Jun 18 '12 at 20:14
  • 1
    I didn't actually calculate `2` to the `48`th power. I just mentioned the "mathematical correct" fraction for the number in question, just to make sure everyone knew that I understood the way a 64-bit floating-point number works. The way I entered my number to .NET, was through the C# code line `const double evil = 31.0000000000000497;`. – Jeppe Stig Nielsen Jun 18 '12 at 20:22
  • Do you have a link or more canonical source for your answer, that this is a known issued with Intel CPUs? – Matthew Jun 19 '12 at 22:32
  • The question doesn't suffer from any sort of floating point rounding issues. It starts out with an *exact* floating point value, and then converts it to a decimal. In all likeliness, there is not a single floating point operation involved in this process. This answer doesn't address the question. – IInspectable Jun 25 '20 at 16:40