
There are a lot of great answers on this site explaining that simple arithmetic on C# doubles with a fractional portion can lead to serious equality issues, where the results of two calculations that are clearly algebraically equivalent compare as unequal, even when only addition, subtraction, and multiplication (and no division) of doubles is involved. But is there still a possibility of serious equality issues between two arithmetic expressions of doubles if all of the doubles involved have only integral parts and no fractional parts?

Namely, does arithmetic (excluding division) on C# doubles with only integral parts produce results that also have only integral parts, so that two algebraically equal expressions like 3.0d + (2.0d * 3.0d * 5.0d) - 2.0d and 2.0d + (10.0d * 3.0d) - 1.0d can be safely compared with an if (... == ...) condition? Or could such expressions over an arbitrary set of double literals, including very large ones, evaluate to slightly different numbers, possibly even with different integral parts, so long as the results and their intermediate calculations stay between -9,007,199,254,740,992 and 9,007,199,254,740,992 (i.e. ±2^53, the bounds in C# beyond which not every 64-bit integer can be represented exactly by a double)?
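To make the question concrete, here is the kind of check I mean, using the two expressions above (just an illustration):

    // Both expressions are algebraically 31. The question is whether
    // == is guaranteed to be true for expressions like these in general.
    double a = 3.0d + (2.0d * 3.0d * 5.0d) - 2.0d;
    double b = 2.0d + (10.0d * 3.0d) - 1.0d;
    if (a == b)
    {
        Console.WriteLine("equal");
    }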

traderdev
  • Will leave answering for someone who can include any relevant references, but the answer is "there is no risk of inaccuracy if there is no decimal portion to the values". Floating point inaccuracy only applies to values after the decimal point, so if you're sticking to integrals no inaccuracy will be introduced. – Callum Morrisson May 11 '21 at 23:38
  • @CallumMorrisson I was not aware of that, so would love to see a citation (not saying you are wrong). – mjwills May 12 '21 at 01:15
  • https://stackoverflow.com/a/4290922/34092 may be useful @traderdev. – mjwills May 12 '21 at 01:20
  • @mjwills interesting, so my comment is inaccurate beyond a certain scale (52 bit number). I wasn't aware of that either. Thanks! – Callum Morrisson May 12 '21 at 02:33
  • @CallumMorrison I suspect it is also complicated by the fact that double calculations sometimes use more than 64-bits. ;) https://stackoverflow.com/questions/64546440/how-does-c-sharp-implicitly-cast-terms-of-integral-types-to-terms-of-double – mjwills May 12 '21 at 03:40

1 Answer


Not necessarily.

Machine Precision

Floating point numbers (ignoring subnormals) are represented internally (per IEEE 754) as follows:

x = s*(1 + m*2^(-52))*2^(k-1023)

where m and k are non-negative integers, with 52 bits for the mantissa m and 11 bits for the biased exponent k, and s is the sign, either -1 or +1. For the remainder of this post I ignore the sign and assume x is positive.
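These fields can also be read directly out of the raw bit pattern. A minimal sketch using BitConverter.DoubleToInt64Bits (along the lines of Jeremy Lakeman's comment below):

    // IEEE 754 bit layout of a double:
    // bit 63 = sign, bits 62..52 = k (biased exponent), bits 51..0 = m.
    long bits = BitConverter.DoubleToInt64Bits(40000000.0);
    int s = bits < 0 ? -1 : +1;
    int k = (int)((bits >> 52) & 0x7FF);        // 1048 for x = 40000000
    ulong m = (ulong)bits & 0xFFFFFFFFFFFFFUL;  // 865109492629504
    Console.WriteLine($"s={s}, k={k}, m={m}");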

The smallest change you can detect for a value x is

eps(x) = 2^(floor(log(x,2))-52)

For example, x=40000000 is

40000000 = (1+865109492629504*2^(-52))*2^(1048-1023)
         = (1+0.192092895507812)*2^25
         = (1.192092895507812)*33554432

and the next larger representable number has +1 in m

next = (1+865109492629505*2^(-52))*2^(1048-1023)
     = (1+0.19209289550781272)*2^25
     = 40000000.000000007
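
On .NET Core 3.0 and later you can confirm this with Math.BitIncrement (a quick check, not part of the derivation):

    double x = 40000000.0;
    double next = Math.BitIncrement(x);  // next representable double above x
    Console.WriteLine(next - x);         // 7.450580596923828E-09, i.e. 2^-27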

The precision of x (equal to the numeric distance between x and next) is

eps = 2^(floor(log(x,2))-52)
    = 2^(25-52)
    = 2^(-27)
    = 0.000000007450580596923828125

So any double value, whether it is fractional or not, has a certain precision associated with it, and the accumulated error can get worse with each calculation.

This means that if you do 134,217,728 (= 2^27) calculations with the x above, each with a worst-case rounding error of eps, you can end up with an uncertainty of more than 1.0. This is not so far-fetched: a nested pair of loops of 11,600 iterations each gets you there (11,600^2 ≈ 134 million).
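As a contrived sketch of how this can happen (my own illustration, not a benchmark): a value smaller than half of eps(x) is absorbed entirely by rounding on each addition, so the computed sum never moves while the algebraic result drifts away:

    double x = 40000000.0;
    double step = Math.Pow(2, -27)/3.0;  // about a third of eps(x)
    double sum = x;
    for (int i = 0; i < 134217728; i++)  // 2^27 additions
        sum += step;                     // each one rounds back to sum
    double exact = x + 134217728*step;   // algebraic result, about x + 1/3
    Console.WriteLine(sum - exact);      // about -0.333: the accumulated error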

The limiting example is that of x=9007199254740992.0 (that is, 2^53).

The precision of this number is eps(x)=2.0, so adding 1.0 to x does not produce the expected result.
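
A quick check of that limit:

    double x = 9007199254740992.0;    // 2^53, where eps(x) = 2.0
    Console.WriteLine(x + 1.0 == x);  // True:  x + 1.0 rounds back to x
    Console.WriteLine(x + 2.0 == x);  // False: x + 2.0 is representable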

You should use the eps(x) function above to decide how close two numbers are in terms of their precision.

Addendum

The composition/decomposition of a number can be examined with the following code:

    // Reassemble x = s*(1 + m*2^(-52))*2^(k-1023) from its parts.
    static double Compose(sbyte s, ulong m, short k)
    {
        return s*(1.0 + m*Math.Pow(2, -52))*Math.Pow(2, k - 1023);
    }

    // Split a nonzero (normal) double into sign, mantissa and biased exponent.
    static void Decompose(double x, out sbyte s, out ulong m, out short k)
    {
        s = (sbyte)Math.Sign(x);   // sbyte rather than byte, so -1 survives the cast
        x = Math.Abs(x);
        k = (short)(Math.Floor(Math.Log(x, 2)) + 1023);
        var z = x/Math.Pow(2, k - 1023) - 1;
        m = (ulong)(4503599627370496*z);  // 4503599627370496 = 2^52
    }

    // Spacing between x and the next representable double.
    static double Eps(double x)
    {
        x = Math.Abs(x);
        if (x == 0) return Eps(1.0);
        return Math.Pow(2, Math.Floor(Math.Log(x, 2)) - 52);
    }
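
For example, round-tripping the 40000000 decomposition worked out above (my own usage sketch):

    Decompose(40000000.0, out sbyte s, out ulong m, out short k);
    Console.WriteLine($"s={s}, m={m}, k={k}");  // s=1, m=865109492629504, k=1048
    Console.WriteLine(Compose(s, m, k));        // 40000000
    Console.WriteLine(Eps(40000000.0));         // 7.450580596923828E-09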

I don't understand why C# uses double.Epsilon = 2^(-1074) (the smallest positive subnormal) and does not have a built-in function for Eps(x) like Julia and other languages have.

John Alexiou
  • `Compose` / `Decompose` might be better implemented via `BitConverter.Int64BitsToDouble` / `BitConverter.DoubleToInt64Bits` & bitwise operations. – Jeremy Lakeman May 12 '21 at 02:22
  • @JeremyLakeman - it is just an illustration of the process. A cartoon of the encoding of floating-point numbers. – John Alexiou May 12 '21 at 02:52
  • I marked this as the (excellent) answer. I lack the reputation to upvote answers, so I'll just say thank you for solving my problem in great detail, and to @mjwills for the links to the other helpful threads that further discuss the challenges with doubles that I am currently getting my head around. – traderdev May 12 '21 at 22:10