0

base-2: why DECIMAL_DIG - DIG == 2 or 3?

Examples:

  • FLT_DECIMAL_DIG - FLT_DIG == 9 - 6 == 3
  • DBL_DECIMAL_DIG - DBL_DIG == 17 - 15 == 2
  • LDBL_DECIMAL_DIG - LDBL_DIG == 21 - 18 == 3
  • FLT16_DECIMAL_DIG - FLT16_DIG == 5 - 3 == 2

Extra: Is it guaranteed that on any given implementation using base-2 this difference will be 2 or 3?

pmor
  • 5,392
  • 4
  • 17
  • 36
  • probably related: [Why do I need 17 significant digits (and not 16) to represent a double?](https://stackoverflow.com/q/6118231/995714), [17 Digits Gets You There, Once You’ve Found Your Way](https://www.exploringbinary.com/17-digits-gets-you-there-once-youve-found-your-way/) – phuclv Oct 05 '21 at 11:48
  • 4
    2 <= ceil(1+p*log10(2)) - floor((p-1)*log10(2)) <= 3, mainly because log10(2) is an irrational number between 0 and 1, and p is a positive integer. – Ian Abbott Oct 05 '21 at 12:41

2 Answers2

1

Yes, for base-2 radix, xxx_DECIMAL_DIG - xxx_DIG will be 2 or 3.

Informal proof

For a floating point type:

  • b is the base or radix or exponent representation (an integer > 1)
  • p is the precision (the number of base-b digits in the significand) (I assume this is > 1 and rational)

The xxx_DECIMAL_DIG values are defined as:

p log10 b — if b is a power of 10
⎱ ⌈1 + p log10 b⌉ — otherwise

The xxx_DIG values are defined as:

p log10 b — if b is a power of 10
⎱ ⌊(p - 1) log10 b⌋ — otherwise

For b = 2, log10 b ≈ 0.301 and is irrational (proof).
p log10 2 is irrational (since p is rational and > 1).
∴ ⌈p log10 2⌉ - ⌊p log10 2⌋ = 1.
∴ ⌈1 + p log10 2⌉ - ⌊p log10 2⌋ = 2. — ①
p log10 2⌋ - ⌊(p - 1) log10 2⌋ ∈ {0, 1}. — ② (since 0 < log10 2 < 1)
∴ ⌈1 + p log10 2⌉ - ⌊(p - 1) log10 2⌋ ∈ {2, 3}. — (from ① and ②) ∎

Ian Abbott
  • 15,083
  • 19
  • 33
  • FYI: Here is an interesting [DR](https://www.open-std.org/jtc1/sc22/wg14/www/docs/summary.htm#dr_501) discussing whether `DECIMAL_DIG` can be larger than necessary. – pmor Sep 06 '22 at 19:44
1

Given definitions of xxx_DECIMAL_DIG and xxx_DIG we have
2 <= ceil(1+p*log10(2)) - floor((p-1)*log10(2)) <= 3
Simplifying:

2 <=     ceil(1+p*log10(2)) - floor((p-1)*log10(2)) <= 3
2 <= 1 + ceil(  p*log10(2)) - floor((p-1)*log10(2)) <= 3
let q = p – 1
2 <= 1 + ceil(  (1 + q)   *log10(2)) - floor((p-1)*log10(2)) <= 3
2 <= 1 + ceil(log10(2) + q*log10(2)) - floor(  q  *log10(2)) <= 3
let R = trunc(q*log10(2))
let r = trunc(q*log10(2)) – R, r is [0.0 …1.0)
log10(2) = 0.301…
2 <= (1 + R + ceil(0.301… + r)) – (R + floor(r)) <= 3
2 <=  1     + ceil(0.301… + r)       – floor(r)  <= 3
2 <=  1     + ceil(0.301… + r)       – 0         <= 3
When r <= 1.0 - 0.301…
  2 <= 1 + 1 <= 3
Otherwise r > 1.0 - 0.301…
  2 <= 1 + 2 <= 3

Even though it is shown that the difference is 2 or 3, I suppose the next questions are why are
xxx_DECIMAL_DIG: ceil(1+p*log10(2))
and
xxx_DIG: floor((p-1)*log10(2))?

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • FYI: Here is an interesting [DR](https://www.open-std.org/jtc1/sc22/wg14/www/docs/summary.htm#dr_501) discussing whether `DECIMAL_DIG` can be larger than necessary. – pmor Sep 06 '22 at 19:44