25

I have read "Why Are Floating Point Numbers Inaccurate?"

So a floating-point number can sometimes lose precision because of how it is represented (scientific notation with an exponent and a mantissa).

But if I cast an integer value to double in Java, is there any chance that the value changes slightly?

I mean,

    int i = 3;
    double d = (double) i;

    System.out.println(d);

The output was 3.0, as I expected.

But is there any chance that the value comes out slightly different, like 3.000001, because of how Java represents a double?

Uzzal Podder

2 Answers

33

Not for int to double, but it can happen for long to double (or int to float):

  • Not all longs greater than 2^53 (or less than -2^53) can be represented exactly by a double;
  • Not all ints greater than 2^24 (or less than -2^24) can be represented exactly by a float.

This restriction comes from the number of bits of precision in the mantissa (53 in doubles, 24 in floats, counting the implicit leading 1 bit that is not stored).
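The limits above can be checked with a short sketch. The class and method names here (`ExactConversion`, `isExactAsDouble`, `isExactAsFloat`) are made up for illustration. The long check compares through `BigDecimal` instead of casting back to long, because casting an out-of-range double back to long saturates and could hide the loss for `Long.MAX_VALUE`.

```java
import java.math.BigDecimal;

public class ExactConversion {
    // Hypothetical helper: true if the long converts to double without any change.
    // new BigDecimal(double) holds the exact binary value of the double.
    public static boolean isExactAsDouble(long value) {
        return new BigDecimal((double) value).compareTo(BigDecimal.valueOf(value)) == 0;
    }

    // Hypothetical helper: true if the int converts to float without any change.
    // int -> double is always exact, so double serves as the reference here.
    public static boolean isExactAsFloat(int value) {
        return (double) (float) value == (double) value;
    }

    public static void main(String[] args) {
        long doubleLimit = 1L << 53; // 2^53
        int floatLimit = 1 << 24;    // 2^24

        System.out.println(isExactAsDouble(doubleLimit));       // true
        System.out.println(isExactAsDouble(doubleLimit + 1));   // false
        System.out.println(isExactAsFloat(floatLimit));         // true
        System.out.println(isExactAsFloat(floatLimit + 1));     // false
        System.out.println(isExactAsDouble(Integer.MAX_VALUE)); // true: every int fits in a double
    }
}
```

Every int has at most 32 significant bits, well under the 53 a double provides, which is why the cast in the question always yields exactly 3.0.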

Andy Turner
  • Also easily checkable with: `int i = Integer.MAX_VALUE; float f = (float) i; double d = (double) i;` which will have `f != d` evaluate to true, as the float cannot represent the value exactly. – Ben Jun 06 '18 at 08:58
  • 4
    @Ben such a simple check might be misleading: you might have happened to pick `Integer.MIN_VALUE` to test this out, and you would have found that `Integer.MIN_VALUE` *can* be exactly represented as a float and as a double. It just so happens that it works for `Integer.MAX_VALUE`. – Andy Turner Jun 06 '18 at 09:00
  • That's correct. Probably not an optimal example. Just found it worth mentioning as it is a "quick check" proving the point. – Ben Jun 06 '18 at 09:02
  • @Ben No, it doesn't prove anything. You can also show that 2^34 fits in a float that way, even if it doesn't fit in an int. – Mr Lister Jun 06 '18 at 10:37
  • 1
    The "quick check" approach can be resurrected by always testing a number and its successor. _Iff_ there is an overflow problem in the tested range, only one of them can be represented exactly. For a more explicit check, try something along these lines: `long num = (long) Math.pow(2,53); System.out.println(num == (long) (double) num); System.out.println((num+1) == (long) (double) (num+1)); System.out.println(num == (long) (double) (num+1));` – arne.b Jun 06 '18 at 11:01
  • 3
    It only takes one exception to prove a "rule" to be false. Just because `Integer.MAX_VALUE` 'just' happens to work doesn't make it unreliable for proving the case – Baldrickk Jun 06 '18 at 12:59
  • 1
    Note: 2^24 and 2^53 are representable (in both float and double). The first unrepresentable integer in a float is 2^24+1 and the first unrepresentable integer in a double is 2^53+1 – plugwash Jun 06 '18 at 15:16
  • Also note that 53 and 24 are the effective sizes of the mantissa, the actual size is one bit less because IEEE floating point doesn't bother to store the leading 1. – plugwash Jun 06 '18 at 15:17
  • @AndyTurner Checking P(X) doesn't always check ~P(~X); if Ben's check *fails*, there are integers that cannot be represented. If Ben's check *succeeds*, it doesn't tell you anything interesting. – Yakk - Adam Nevraumont Jun 06 '18 at 15:42
  • @Yakk-AdamNevraumont the point that I didn't really articulate well was that if you can't remember what these representable ranges are, testing one number may mislead you if you just check one (or a handful of) numbers. You may just be (un)lucky, and pick exactly representable numbers. – Andy Turner Jun 06 '18 at 15:45
  • 1
    What you can do is check odd numbers: if an odd number is representable, then all smaller (closer to zero) odd numbers are as well. – plugwash Jun 06 '18 at 21:00
5

You can iterate over i until you find an i for which the double 2**i is equal to 2**i + 1:

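    import java.util.stream.IntStream;

    public class PrecisionLoss
    {
        public static void main(String[] args) {
            // Step size to test: find the first 2**i where adding epsilon has no effect.
            double epsilon = 1;
            // Walk i = 0, 1, 2, ... and stop at the first i for which
            // 2**i + epsilon rounds back to 2**i in double arithmetic.
            int maxInt = IntStream.iterate(0, i -> i + 1)
                    .filter(i -> Math.pow(2, i) == Math.pow(2, i) + epsilon)
                    .findFirst().getAsInt();
            System.out.println("Loss of precision is greater than " + epsilon
                    + " for 2**" + maxInt + " when using double.");
        }
    }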

It outputs:

    Loss of precision is greater than 1.0 for 2**53 when using double.

This confirms the accepted answer.
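A related way to see the same boundary, sketched here as an alternative to the search above, is to ask `Math.ulp` for the gap between a value and the next larger representable one. The class name `UlpSpacing` and the method `firstExponentWithGapAboveOne` are made up for this illustration.

```java
public class UlpSpacing {
    // Hypothetical helper: smallest exponent i for which the gap between
    // 2**i and the next larger double is greater than 1.
    public static int firstExponentWithGapAboveOne() {
        int exponent = 0;
        while (Math.ulp(Math.pow(2, exponent)) <= 1.0) {
            exponent++;
        }
        return exponent;
    }

    public static void main(String[] args) {
        System.out.println(Math.ulp(Math.pow(2, 52)));      // 1.0
        System.out.println(Math.ulp(Math.pow(2, 53)));      // 2.0
        System.out.println(Math.ulp((float) (1 << 24)));    // 2.0
        System.out.println(firstExponentWithGapAboveOne()); // 53
    }
}
```

A gap of 2.0 at 2**53 means 2**53 + 1 falls between two adjacent doubles, which matches the result of the search.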

Note that in JavaScript, there is no integer type and doubles are used instead (they are called Numbers). Once integers stored as Numbers are large enough, consecutive values can compare equal to each other.

Eric Duminil