
On a 32-bit system, I found that the operation below always returns the correct value when a < 2^31, but returns seemingly random results when a is larger.

uint64_t a = 14227959735;
uint64_t b = 32768;
float c = 256.0;
uint64_t d = a - b / c; // d ends up as 14227959808; the exact result would be 14227959607

I believe the problem is that the int-to-float conversion invokes undefined behavior, but could someone help explain why it gives this particular value?

Pang
Jackie.Y
  • floating point calc only accurate to 7 digits http://stackoverflow.com/questions/9765744/precision-in-c-floats – pm100 Dec 21 '16 at 01:38
  • There's nothing random or undefined, and the result you're getting is very nearly correct (off by one part in 100 million). It's just a rounding error. If you used `double` rather than `float` you'd get the exact result, though you could still get rounding errors for larger input values. – Keith Thompson Dec 21 '16 at 01:45
  • There are a large number of questions of which this could be a duplicate; the one I chose has the merit of being one of the oldest on SO (and it has excellent answers). Incidentally, the result would be the same on a 64-bit system; the problem is that `float` (almost always) uses 4 bytes which doesn't give very many decimal digits of precision. – Jonathan Leffler Dec 21 '16 at 01:48

1 Answer


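The entire calculation is done in float: because c is a float, both a and b are converted to float before the arithmetic, and the result is then converted back to a 64-bit integer. A typical (IEEE 754) float has only a 24-bit significand, so it can represent every integer exactly only up to 2^24. Above that, only multiples of ever larger powers of two are representable, and any other value is rounded to the nearest one.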

Malcolm McLean