4

Why does a single-precision floating point number have 7 digit precision (or double 15-16 digits precision)?

Can anyone please explain how we arrive at that, based on the 32 bits assigned to a float (sign (31), exponent (30-23), fraction (22-0))?

phuclv
avulosunda

2 Answers

10

23 fraction bits (22-0) of the significand appear in the memory format but the total precision is actually 24 bits since we assume there is a leading 1. This is equivalent to log10(2^24) ≈ 7.225 decimal digits.

Double-precision float has 52 bits in fraction, plus the leading 1 is 53. Therefore a double can hold log10(2^53) ≈ 15.955 decimal digits, not quite 16.
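As a quick check of the arithmetic above, here is a minimal Java sketch. The class and method names (`DecimalDigits`, `decimalDigits`) are made up for illustration; the only fact used is that log10(2^n) = n * log10(2).

```java
public class DecimalDigits {
    // Decimal digits carried by a significand of the given width:
    // log10(2^bits) = bits * log10(2).
    static double decimalDigits(int significandBits) {
        return significandBits * Math.log10(2.0);
    }

    public static void main(String[] args) {
        // float: 23 stored fraction bits + 1 implicit leading bit
        System.out.println("float : " + decimalDigits(24));
        // double: 52 stored fraction bits + 1 implicit leading bit
        System.out.println("double: " + decimalDigits(53));
    }
}
```

The two printed values should come out close to the 7.225 and 15.955 quoted above.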

Note: The leading 1 is not the sign bit. The value is (-1)^sign * 1.ffffffff * 2^(eeee-constant), and the leading 1 of the significand need not be stored in the fraction. The sign bit must still be stored.
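To make the layout concrete, the following sketch pulls the three fields out of a float with `Float.floatToIntBits` and rebuilds the value from them. The class and method names are hypothetical, the "constant" is taken to be the IEEE 754 single-precision bias of 127, and `rebuild` only handles normal numbers (not zero, subnormals, infinities or NaN).

```java
public class FloatFields {
    // Sign bit (bit 31).
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Biased exponent (bits 30-23).
    static int biasedExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Stored fraction (bits 22-0); the leading 1 is not part of it.
    static int fraction(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    // (-1)^sign * 1.fraction * 2^(exponent - 127), normal numbers only.
    static double rebuild(float f) {
        // Add back the implicit leading 1 that is never stored.
        double significand = 1.0 + fraction(f) / (double) (1 << 23);
        double magnitude = significand * Math.pow(2.0, biasedExponent(f) - 127);
        return sign(f) == 0 ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        float f = -6.5f; // -1.625 * 2^2
        System.out.println("sign     = " + sign(f));
        System.out.println("exponent = " + biasedExponent(f));
        System.out.println("fraction = 0x" + Integer.toHexString(fraction(f)));
        System.out.println("rebuilt  = " + rebuild(f));
    }
}
```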


Some numbers, such as 1/9, cannot be represented as a finite sum of powers of 2, so they are stored only approximately:

>>>> double d = 0.111111111111111;
>>>> System.out.println(d + "\n" + d*10);
0.111111111111111
1.1111111111111098

If a financial program were to do this calculation over and over without self-correcting, there would eventually be discrepancies.

>>>> double d = 0.111111111111111;
>>>> double sum = 0;
>>>> for(int i=0; i<1000000000; i++) {sum+=d;}
>>>> System.out.println(sum);
111111108.91914201

After 1 billion summations, we are missing over $2.
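For comparison, here is a sketch of the same accumulation done with `java.math.BigDecimal`, which works in decimal and therefore adds this particular value exactly. The class and method names are invented for the example, and the loop is cut to one million iterations (an assumption, purely to keep the run short).

```java
import java.math.BigDecimal;

public class ExactSum {
    // Adds the decimal value `term` to itself `count` times, exactly.
    static BigDecimal sumExact(String term, int count) {
        BigDecimal t = new BigDecimal(term);
        BigDecimal sum = BigDecimal.ZERO;
        for (int i = 0; i < count; i++) {
            sum = sum.add(t);
        }
        return sum;
    }

    // The same loop in binary floating point, for side-by-side output.
    static double sumDouble(double term, int count) {
        double sum = 0;
        for (int i = 0; i < count; i++) {
            sum += term;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("BigDecimal: " + sumExact("0.111111111111111", n));
        System.out.println("double    : " + sumDouble(0.111111111111111, n));
    }
}
```

Note that the `BigDecimal` is built from a string; constructing it from the `double` literal would carry the binary rounding error along with it.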

Ron
  • 1
    The leading 1 is not a sign bit. It is actually `(-1)^sign * 1.ffffffff * 2^(eeee-constant)` but we need not store the leading 1 in the fraction. The sign bit must still be stored – Ron Oct 02 '13 at 05:24
  • 1
    I have seen some places give the precision as (15-16). Will **15.955** ever be **16**? – avulosunda Oct 02 '13 at 05:38
  • 3
    @jb_2519 As Ron shows, double-precision floating points have 15.955 *decimal* digits of precision. That means that you can rely pretty well on the first 15 *decimal* digits being accurate, with any following digits being only partially representable at best. Personally I wouldn't rely on anything past the 14th (or 6th in single-precision) decimal digit being accurate. – Corey Oct 02 '13 at 05:52
  • You should never rely on floating-point variables (any precision) to hold exact values. Use `BigDecimal` if you must be sure. – Ron Oct 02 '13 at 07:00
  • 1
    @RonE Why are we taking log base 10 to calculate the number of decimal digits? Can you explain this concept to me? – Pankaj Mahato Aug 11 '14 at 17:05
  • 2
    @PankajMahato That's just how you calculate it. For example, if we want to represent the number 2^24 in base 10, it is 16777216. Since log10(2^24) = 7.225, we can see that this should be a leading digit followed by 7 more. In reverse, if we want to find the smallest binary number that has 8 following digits in decimal, we calculate log2(10^8) = 26.58. Therefore we need a 27-bit binary number to get a decimal number that has a leading digit followed by 8 more (9 digits total). Keep in mind that 10^8 is a 1 followed by 8 zeros, for a total of 9 digits. – Ron Aug 14 '14 at 20:43
  • @Ron, thanks. Very informative explanation of why we must take the log. – Erdogan CEVHER May 21 '19 at 14:11
2

A 32-bit float has 23 fraction bits, so the smallest unit of the fraction is

2^(-23) = 0.00000011920928955078125

Every other fraction value is larger than 0.00000011920928955078125; nothing smaller can be stored. All of them are multiples of this unit:

0.00000011920928955078125 * n

So every value of the form 0.00000x (x from 1 to 9) can be expressed easily, and float32 certainly has 6 digits of precision. Ignoring rounding (the products are simply cut off after the first nonzero digit), the 7th digit works out as below:

0.00000011920928955078125 * 1 = 0.0000001
0.00000011920928955078125 * 2 = 0.0000002
0.00000011920928955078125 * 3 = 0.0000003
0.00000011920928955078125 * 4 = 0.0000004
0.00000011920928955078125 * 5 = 0.0000005
0.00000011920928955078125 * 6 = 0.0000007
0.00000011920928955078125 * 7 = 0.0000008
0.00000011920928955078125 * 8 = 0.0000009
0.00000011920928955078125 * 9 = 0.000001

No multiple lands on 0.0000006. This is why float32 is quoted all over the internet as having 6 to 7 digits of precision.
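The unit used above can be reproduced in Java with `Math.ulp(1.0f)`, which returns the gap between 1.0f and the next larger float, that is 2^-23. The sketch below prints the exact decimal expansion of its first nine multiples; the class and method names are hypothetical. Keep in mind that this gap applies to floats between 1 and 2, so the table is an illustration at that scale rather than a statement about every float.

```java
import java.math.BigDecimal;

public class FloatStep {
    // Gap between 1.0f and the next representable float: 2^-23.
    static float stepAtOne() {
        return Math.ulp(1.0f);
    }

    // Exact decimal expansion of n * 2^-23 (the product is exact in a double).
    static String multiple(int n) {
        return new BigDecimal(n * (double) stepAtOne()).toPlainString();
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 9; n++) {
            System.out.println(n + " * step = " + multiple(n));
        }
        // Half a step is lost when added to 1.0f: the sum rounds back to 1.0f.
        System.out.println(1.0f + stepAtOne() / 2 == 1.0f);
    }
}
```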

上山老人