
This is a request for further clarification of a question that was answered some time ago here: biggest integer that can be stored in a double

The top answer says it is "the largest integer such that it and all smaller integers can be stored in IEEE 64-bit doubles without losing precision. An IEEE 64-bit double has 52 bits of mantissa, so I think it's 2^53", because:

  • 2^53 + 1 cannot be stored, because the 1 at the start and the 1 at the end have too many zeros in between.

  • Anything less than 2^53 can be stored, with 52 bits explicitly stored in the mantissa, and then the exponent in effect giving you another one.

  • 2^53 obviously can be stored, since it's a small power of 2.
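The three points above can be checked with a small round-trip test. This is only a sketch: `round_trips` is a helper name made up for this illustration, and it assumes Python's `float` is an IEEE 754 64-bit double, which is the case on common platforms.

```python
def round_trips(n: int) -> bool:
    """Return True if the integer n survives conversion to a double and back."""
    return int(float(n)) == n

LIMIT = 2**53

print(round_trips(LIMIT - 1))  # True: fits in the 53-bit significand
print(round_trips(LIMIT))      # True: a power of two
print(round_trips(LIMIT + 1))  # False: rounds to 2^53
print(round_trips(LIMIT + 2))  # True: even numbers are still representable here
```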

Can someone clarify the first point? What does he mean by that? Is he talking about, for example, a 4-bit number where 1000 + 0001 can't be stored in 4 bits? 2^53 is just the first bit 1 and the rest 0's, right? How come you can't add a 1 to that without losing precision?

Also, "The largest integer such that it and all smaller integers can be stored in IEEE". Is there some general rule such that if I wanted to find the largest n-bit integer such that it and all smaller integers can be stored in IEEE, could I simply say that it is 2^n? For example, if I were to find the largest 4-bit integer such that it and all integers below it can be represented, would it be 2^4?

john

1 Answer


is he talking about for example if it were a 4 bit number 1000 + 0001, you can't store that in 4 bits?

No, he is saying that you can't store that in 3 bits, using the usual binary notation.

2^53 is just the first bit 1 and the rest 0's right?

Yes, and so are 1, 2, 4, …, 2^53, 2^54, 2^55, …, 2^123, 2^124, … and also 0.125.
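To see that all of these share the same stored significand, one can look at the raw bits. The snippet below is a rough illustration, assuming Python's `float` is an IEEE 754 double; `fraction_bits` is a hypothetical helper, not a library function.

```python
import struct

def fraction_bits(x: float) -> int:
    """Return the 52 explicitly stored significand bits of a double."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits & ((1 << 52) - 1)

# Every power of two has an all-zero fraction field; only the exponent differs.
for x in (1.0, 2.0, 4.0, 2.0**53, 2.0**123, 0.125):
    print(x, fraction_bits(x))

# A value that is not a power of two has at least one fraction bit set.
print(fraction_bits(3.0) != 0)  # True
```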

This is floating-point we are talking about. 2^53 is just an implicit 1 with all explicit significand bits 0, yes, but it is not the only number with this property. The crucial property is that the ULP (unit in the last place, the spacing between adjacent representable values) at 2^53 is 2. So 2^53 can be represented, as can all powers of two that are in range, and 2^53 + 1 cannot, because the ULP is too large in that neighborhood.
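The ULP can be inspected directly. As a sketch, assuming Python 3.9 or later (for `math.ulp` and `math.nextafter`) and that `float` is an IEEE 754 double:

```python
import math

x = 2.0**53

ulp_at_limit = math.ulp(x)             # spacing between x and the next double
ulp_below = math.ulp(2.0**52)          # spacing one binade lower
next_double = math.nextafter(x, math.inf)

print(ulp_at_limit)          # 2.0
print(ulp_below)             # 1.0
print(next_double == x + 2)  # True: the next double after 2^53 is 2^53 + 2
print(x + 1 == x)            # True: 2^53 + 1 rounds back to 2^53
```

Below 2^53 the spacing is at most 1, so every integer is hit; from 2^53 upward the spacing is 2 or more, so odd integers start to be skipped.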

Also, "The largest integer such that it and all smaller integers can be stored in IEEE". Is there some general rule such that if I wanted to find the largest n-bit integer such that it and all smaller integers can be stored in IEEE, could I simply say that it is 2^n?

Yes, in binary IEEE 754 floating-point, every “largest integer such that it and all smaller integers can be stored” is a power of two, and specifically 2^n where n is the significand's width (counting the implicit bit).
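The rule can be tried on several formats at once. The following is a minimal sketch that uses the `struct` format codes `e`, `f` and `d` as stand-ins for binary16, binary32 and binary64 (significand widths 11, 24 and 53, counting the implicit bit); the helper names are invented for this example.

```python
import struct

def round_trips(fmt: str, n: int) -> bool:
    """Return True if integer n is stored exactly in the given struct float format."""
    return struct.unpack(fmt, struct.pack(fmt, float(n)))[0] == n

def first_unrepresentable(fmt: str, search_limit: int) -> int:
    """Brute-force the smallest positive integer that does not round-trip."""
    for n in range(1, search_limit):
        if not round_trips(fmt, n):
            return n
    raise ValueError("no failure found below search_limit")

# (struct format, significand width n including the implicit bit)
FORMATS = [("<e", 11), ("<f", 24), ("<d", 53)]

for fmt, width in FORMATS:
    limit = 2**width
    # 2^n is stored exactly, 2^n + 1 is not.
    print(fmt, round_trips(fmt, limit), round_trips(fmt, limit + 1))

# For binary16 the range is small enough to search exhaustively.
print(first_unrepresentable("<e", 5000))  # 2049, i.e. 2^11 + 1
```

The exhaustive search is only practical for the 16-bit format; for the wider formats the sketch just checks the two values on either side of the boundary.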

Pascal Cuoq