Try this:

(float)100008009

And you will probably get

100008008

The issue is that we get no warning. And this can't be overflow since floats can take higher values. So I can't explain this result.

What is the Max value for 'float'?

profimedica
  • `float` (`Single`) has **24** bit *mantissa* which means that `float` can represent *exactly* numbers up to `2^24` (`16_777_216`). Since `100_008_009 > 16_777_216` you have rounding error – Dmitry Bychenko Sep 09 '21 at 20:45
  • @DmitryBychenko: `float` can represent much larger numbers exactly. For example, it can represent 100,008,016 exactly. – Eric Postpischil Sep 09 '21 at 20:47
  • @EricPostpischil You're looking at significant digits *in base 10*, but what matters is significant digits *in base 2*. 100,008,016 has 24 significant digits in base 2 (within the limit); 100008009 has 27 significant digits in base 2 (over the limit). Sometimes people will tell you how many significant digits you have in base 10, but that's just an approximation; what actually matters is the significant digits in base 2. – Servy Sep 09 '21 at 20:52
  • What everyone is saying is that when you use floats, you can't expect (or expect to be warned) that you will get exact numbers. Think about it, a float is a 32-bit quantity, with about 4 billion distinct values. However it can represent numbers up to 10 to the 38th. You pay for that range in precision – Flydog57 Sep 09 '21 at 20:52
  • @Servy: Per the IEEE-754 specification, the single format (IEEE-754 binary32) represents 100,008,016 exactly. Decimal digits are irrelevant; the representation is defined as a mathematical expression, not a decimal numeral. – Eric Postpischil Sep 09 '21 at 20:53
  • @Eric Postpischil: `float` can represent all numbers up to `16_777_216` exactly, all *even* (divisible by `2`) numbers up to `33_554_432` exactly, all numbers divisible by `4` up to `67_108_864`, all divisible by `8` up to `134_217_728`, etc. Since `100_008_016 < 134_217_728` and `100_008_016` is divisible by `8`, it can be represented exactly – Dmitry Bychenko Sep 09 '21 at 20:53
  • @EricPostpischil Yes, 100,008,016 can be represented exactly as a float. It has 24 significant digits in binary. That's not the number the OP asked about. They asked about 100008009, which *cannot* be represented exactly by a float. – Servy Sep 09 '21 at 20:55
  • @DmitryBychenko: `float` can represent all **integers**, not all numbers, up to 16,777,216. – Eric Postpischil Sep 09 '21 at 20:58
  • @Servy: Re “That's not the number the OP asked about.”: I was not writing about the number OP asked about, I was writing about the misleading statement that “`float` can represent *exactly* numbers up to 2^24.” `float` can in fact represent many larger numbers exactly. – Eric Postpischil Sep 09 '21 at 20:58
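
The spacing pattern described in the comments above can be checked with a round trip through `float`. This is only a minimal sketch, assuming C# 7 or later; the class `FloatSpacing` and the helper `IsExactInFloat` are hypothetical names introduced for illustration, not part of any library.

```csharp
using System;

static class FloatSpacing
{
    // Hypothetical helper: true when the integer survives a round trip
    // through float unchanged, i.e. it is exactly representable.
    public static bool IsExactInFloat(long n) => (long)(float)n == n;

    static void Main()
    {
        long[] samples =
        {
            16_777_216,   // 2^24: exact
            16_777_217,   // odd, above 2^24: rounded
            16_777_218,   // even, below 2^25: exact
            100_008_008,  // multiple of 8, below 2^27: exact
            100_008_009,  // not a multiple of 8: rounded
            100_008_016,  // multiple of 8, below 2^27: exact
        };

        foreach (long n in samples)
            Console.WriteLine($"{n}: {(IsExactInFloat(n) ? "exact" : "rounded")}");
    }
}
```

The check only says whether a value is exactly representable; it does not warn at the point of conversion, which is the behaviour the question is about.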

1 Answer

> The issue is that we get no warning.

Floating-point is intended to approximate real-number arithmetic. So rounding during conversion is part of the design, meaning it is normal, so it does not get a warning. The closest value to 100008009 representable in float is 100008008, so that is the result.
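
To see the two representable neighbours of 100008009, one option is to step through the raw bit pattern. The following is a rough sketch, assuming C# on .NET Core 2.0 or later (where `BitConverter.SingleToInt32Bits` and `BitConverter.Int32BitsToSingle` are available); `NearestFloat` and `NextUp` are hypothetical names used only for this example.

```csharp
using System;

static class NearestFloat
{
    // Hypothetical helper: the next representable float above a positive,
    // finite x. For such values, adding 1 to the raw IEEE-754 bit pattern
    // steps to the adjacent larger float.
    public static float NextUp(float x) =>
        BitConverter.Int32BitsToSingle(BitConverter.SingleToInt32Bits(x) + 1);

    static void Main()
    {
        float f = (float)100008009;  // silently rounded to the nearest float
        float next = NextUp(f);

        // Cast to long so the exact integer value is printed, independent
        // of how the runtime formats floats.
        Console.WriteLine((long)f);     // 100008008
        Console.WriteLine((long)next);  // 100008016
    }
}
```

At this magnitude adjacent `float` values are 8 apart, and 100008009 is 1 away from 100008008 but 7 away from 100008016, so the conversion picks 100008008.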

Eric Postpischil
  • I'm not convinced that between 100008008 and 100008009 there are no valid float values. Why not 100008008.1 or 100008008.0001? – profimedica Sep 09 '21 at 21:19
  • I think I got it. As the numbers grow, the precision decreases so much that the gap between values exceeds one, so we will have larger gaps between decimal representations of successive float values. – profimedica Sep 09 '21 at 21:27
  • @profimedica: Yes, there are larger gaps between successive representable values. The decimal representation is irrelevant; how far apart two numbers are is not affected by whether you write them in decimal or binary. – Eric Postpischil Sep 09 '21 at 21:28
  • @profimedica: By definition, a finite number in `float` format is ±f•2^e, where f is a 24-bit unsigned integer and e is in a certain range. (An equivalent definition makes f a 24-bit binary number with one bit before the “binary point” and 23 bits after it, and the range of e is adjusted by 23 to match.) 100008009 is not representable in the `float` format because there is no 24-bit number f that satisfies this; 100008009 in binary has more than 24 bits from its first 1 to its last 1, and so do all numbers between 100008008 and 100008009. – Eric Postpischil Sep 09 '21 at 21:30
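
The ±f•2^e form from the last comment can be read straight out of the bit pattern. This is a sketch under stated assumptions: C# 7 or later on .NET Core 2.0 or later, and a positive, normal input. `FloatParts` and `Decompose` are hypothetical names, not an existing API.

```csharp
using System;

static class FloatParts
{
    // Hypothetical helper: splits a positive, normal float into (F, E)
    // such that value == F * 2^E, where F is a 24-bit integer.
    public static (int F, int E) Decompose(float x)
    {
        int bits = BitConverter.SingleToInt32Bits(x);
        int biasedExponent = (bits >> 23) & 0xFF;  // 8 exponent bits
        int fraction = bits & 0x7FFFFF;            // 23 stored fraction bits
        int f = fraction | (1 << 23);              // restore the implicit leading 1
        int e = biasedExponent - 127 - 23;         // remove bias, then scale f to an integer
        return (f, e);
    }

    static void Main()
    {
        var (f, e) = Decompose((float)100008009);
        Console.WriteLine($"f = {f}, e = {e}");  // f = 12501001, e = 3

        // 100008009 itself spans 27 bits from its first 1 to its last 1,
        // which is more than the 24 bits available for f.
        Console.WriteLine(Convert.ToString(100008009, 2).Length);  // 27
    }
}
```

Here 12501001 × 2^3 = 100008008, which is the value the cast actually stores.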