
Consider the following:

#include <iostream>
#include <cstdint>
#include <cstdlib>   // std::strtoull

int main() {
   std::cout << std::hex
      << "0x" << std::strtoull("0xFFFFFFFFFFFFFFFF",0,16) << std::endl
      << "0x" << uint64_t(double(std::strtoull("0xFFFFFFFFFFFFFFFF",0,16))) << std::endl
      << "0x" << uint64_t(double(uint64_t(0xFFFFFFFFFFFFFFFF))) << std::endl;
   return 0;
}

Which prints:

0xffffffffffffffff
0x0
0xffffffffffffffff

The first number is just the result of converting ULLONG_MAX from a string to a uint64_t, which works as expected.

However, if I cast the result to double and then back to uint64_t, it prints 0 (the second number).

Normally, I would attribute this to the limited precision of floating point, but what further puzzles me is that if I cast ULLONG_MAX from uint64_t to double and then back to uint64_t, the result is correct (the third number).

Why the discrepancy between the second and the third result?

EDIT (by @Radoslaw Cybulski): For another what-is-going-on-here example, try this code:

#include <iostream>
#include <cstdint>
#include <cstdlib>   // std::strtoull
using namespace std;

int main() {
    uint64_t z1 = std::strtoull("0xFFFFFFFFFFFFFFFF",0,16);
    uint64_t z2 = 0xFFFFFFFFFFFFFFFFull;
    std::cout << z1 << " " << uint64_t(double(z1)) << "\n";
    std::cout << z2 << " " << uint64_t(double(z2)) << "\n";
    return 0;
}

which happily prints:

18446744073709551615 0
18446744073709551615 18446744073709551615
  • A clue that this is undefined behavior: In a local test, with `g++` version 6.3, the behavior varied based on whether or not I passed optimization flags. When I passed `-O1`, `-O2` or `-O3`, I match your behavior. When I don't pass an optimization flag (or pass `-O0` explicitly), the result of both round trip casts is `0` (checking assembly, only `-O0` actually performs the casts for `z2`; optimization skips them on the basis of the standard stating that any case where the cast would make a difference is undefined behavior, per [eerorika's answer](https://stackoverflow.com/a/57113390/364696)). – ShadowRanger Jul 19 '19 at 13:52
  • Clarifying the results of checking the assembly (of Radoslaw's code): At `-O1` and higher, `z2` effectively never exists; the immediate value of `0xffffffffffffffff` is directly loaded into the argument registers immediately before it's printed, w/o ever being stored in a dedicated register or stack location. It's only at `-O0` (which avoids optimizations that would interfere with debugging; it tries to preserve the correspondence between lines of code and the associated assembly) that it bothers to make a stack location for `z2`, load from it on each use, perform the casts on the value, etc. – ShadowRanger Jul 19 '19 at 14:00

1 Answer


The number that is closest to 0xFFFFFFFFFFFFFFFF and is representable by double (assuming 64-bit IEEE 754) is 18446744073709551616, i.e. 2^64. You'll find that this is a bigger number than 0xFFFFFFFFFFFFFFFF. As such, the value is outside the representable range of uint64_t.
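
To see that rounding directly, here is a minimal sketch (not part of the original answer); it assumes IEEE 754 binary64 with the default round-to-nearest mode, and 18446744073709551616.0 is the exact double value 2^64:

#include <cstdint>
#include <iomanip>
#include <iostream>

int main() {
    // 2^64 - 1 has no exact double representation; the nearest double is 2^64.
    double d = static_cast<double>(UINT64_MAX);
    std::cout << std::fixed << std::setprecision(0) << d << "\n";  // 18446744073709551616
    std::cout << (d == 18446744073709551616.0) << "\n";           // 1: d is exactly 2^64
    return 0;
}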

About the conversion back to integer, the standard says (quoting the latest draft):

[conv.fpint]

A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type.


Why the discrepancy between the second and the third result?

Because the behaviour of the program is undefined.

Although it is mostly pointless to analyse the reasons for differences in UB, because the scope of variation is limitless, my guess at the reason for the discrepancy in this case is that in one case the value is a compile-time constant, while in the other case the value comes from a library function call made at runtime.
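
As an aside (not from the original answer): if you need this conversion at runtime, the undefined behaviour can be avoided by checking the range before casting. A minimal sketch, assuming IEEE 754 binary64; the helper name to_u64 is hypothetical, and 18446744073709551616.0 is exactly 2^64:

#include <cstdint>
#include <optional>

// Hypothetical helper: converts double -> uint64_t only when the truncated
// value is representable, so the [conv.fpint] undefined behaviour is never hit.
std::optional<std::uint64_t> to_u64(double d) {
    if (d >= 0.0 && d < 18446744073709551616.0)   // [0, 2^64); also rejects NaN
        return static_cast<std::uint64_t>(d);
    return std::nullopt;
}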

  • I wonder, why does the conversion aim for the larger closest integer instead of the smaller one? – Stack Danny Jul 19 '19 at 13:25
  • 3
    @StackDanny Probably because the larger double is closer than the smaller. The result will depend on current rounding mode. – eerorika Jul 19 '19 at 13:27
  • 4
    Not going to post another answer just for this, but you might want to mention that, for standard IEEE 754 double precision binary floating point, it only has 53 bits of integer level precision. `UINT64_C(1) << 53` is the last contiguous integer value that converts to `double` and back losslessly; any odd value above that limit will round (and eventually, so will values not divisible by 4, 8, 16, etc. as you rely more and more on the exponent to scale the integer component up). – ShadowRanger Jul 19 '19 at 13:38
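
To illustrate the 53-bit-significand point from the last comment, a small sketch (mine, not from the thread); both casts here stay well inside the uint64_t range, so there is no UB, only rounding:

#include <cstdint>
#include <iostream>

int main() {
    uint64_t exact = uint64_t(1) << 53;  // 2^53: last value in the contiguous exactly-representable range
    uint64_t odd   = exact + 1;          // 2^53 + 1: not representable as double, rounds back down to 2^53
    std::cout << (uint64_t(double(exact)) == exact) << "\n";  // 1: round trip is lossless
    std::cout << (uint64_t(double(odd))   == odd)   << "\n";  // 0: value came back as 2^53
    return 0;
}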