
I have a pretty decent understanding of IEEE 754, so this is not one of those "why does adding number a and number b result in..." type of questions.

Rather, I want to ask whether I've understood the fixed-point numeric format specifier correctly, because it's not behaving as I would expect for some double values.

For example:

double d = 0x3FffffFFFFfffe * (1.0 / 0x3FffffFFFFffff);
Console.WriteLine(d.ToString("R"));   // 0.99999999999999989
Console.WriteLine(d.ToString("G20")); // 0.99999999999999989
Console.WriteLine(d.ToString("F20")); // 1.00000000000000000000

Both the "R" and "G" specifier prints out the same thing - the correct value of: 0.99999999999999989 but the "F" specifier always rounds up to 1.0 no matter how many decimals I tell it to include. Even if I tell it to print the maximum number of 99 decimals ("F99") it still only outputs "1."-followed by 99 zeroes.

So is my understanding broken (and can someone point me to the relevant section in the spec), or is this behavior broken? (It's no deal-breaker for me, I just want to know.)

Here is what I've looked at, but I see nothing explaining this.

(This is .NET 4.0.)

AnorZaken

1 Answer


User "w.b" linked to another question which I suspect has the best possible answer available for this question... although there was some fuzzy-ness on the details and documentation. (Unfortunately the comment was removed.)

The linked question was Formatting doubles for output in C#.

In short, it appears that unless you use the G or R specifier, the value is first reduced to 15 significant digits of precision before the standard or custom formatting is applied. The best documentation anyone was able to link to in that question was this MSDN page. The details and wording aren't as crystal clear as I would wish, but as stated, I think this is the best I'm going to find.
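
For illustration, here is a small sketch of that claim (my own code, not taken from the linked answer), comparing the round-trip specifiers against the fixed-point and custom formats on the same value:

double d = 0x3FffffFFFFfffe * (1.0 / 0x3FffffFFFFffff);

// Round-trip / long general formats keep all 17 significant digits:
Console.WriteLine(d.ToString("R"));    // 0.99999999999999989
Console.WriteLine(d.ToString("G17"));  // 0.99999999999999989

// Fixed-point (and, as far as I can tell, custom) formats work on a value that
// has already been rounded to 15 significant digits, so the last two digits
// are lost and the result rounds up to 1:
Console.WriteLine(d.ToString("F20"));                    // 1.00000000000000000000
Console.WriteLine(d.ToString("0.00000000000000000000")); // 1.00000000000000000000

(Newer runtimes have reportedly changed this behavior so that "F" with enough digits no longer loses precision, but this question targets .NET 4.0.)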

AnorZaken
  • Two comments that aren't very clear from the links. 1) The math is being done in base 2, so a conversion from decimal to binary and back to decimal occurs, which affects the rounding of the results. 2) The math can be performed either by the floating-point hardware inside the PC's microprocessor or by simulation in the C# compiler, so sometimes you get a different answer depending on whether you are using the executable in the debug folder or the executable in the release folder. – jdweng Sep 27 '15 at 02:41
  • @jdweng Wouldn't that last part you mentioned be an implementation detail of the JIT? Anyway, if you could link to any specification or other documentation about that I would love to read it. :) – AnorZaken Sep 27 '15 at 15:03
  • If my own memory serves me right, the spec allows intermediate calculations to be performed at a higher precision. In other words, as long as the value is still in a CPU register it may have higher than the mandated precision. (If written back to memory the extra precision is lost, of course.) This isn't always desirable, so you can use the cast syntax to truncate the extra precision by casting to the same type. In other words, this code: `float x = /*some calculation*/; bool isIdentical = x == (float)x;` might not always return true as one might expect... _(if I remember the spec correctly)_. A short sketch of this follows the thread. – AnorZaken Sep 27 '15 at 15:19
  • The debug compilation just gives better visibility into variables at the cost of slower execution. The chip in the PC is very complicated and not all connections in the chip are accessible during debugging. The compiler simulates the connection during debugging. The connection to the floating-point unit in the chip is one of the items that is sometimes simulated (or partially simulated). For example, underflow, overflow, and divide-by-zero are exceptions in the processor in the release executable; during debugging you may not want the exception to occur and instead examine the status register. – jdweng Sep 27 '15 at 15:27
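
To make the idea in my earlier comment concrete, here is a minimal sketch (my own illustration, not taken from the spec; whether the two comparisons ever differ depends on the JIT and the target hardware, and on modern x64/SSE2 targets they usually agree):

// The spec permits intermediate float/double results to be held in a register
// with more precision than the declared type; an explicit cast to the same
// type forces the value to be rounded back to that declared precision.
float x = 1.0f / 3.0f;

// Without the cast, the intermediate "x * 3.0f" may (in principle) still carry
// extra precision when it is compared; the cast narrows it to float first.
bool direct   = (x * 3.0f) == 1.0f;
bool narrowed = ((float)(x * 3.0f)) == 1.0f;

Console.WriteLine(direct + " " + narrowed);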