Comparing doubles properly using aliasing with integer representations and ULPs

Question

I have tried to avoid epsilon comparisons for comparing floating point types. The best solution I could come up with used the difference in ULPs (Unit in the last place), though this article had a much better solution using integer representations (/// indicate my own comments):

/* See
https://randomascii.wordpress.com/2012/01/11/tricks-with-the-floating-point-format/
for the potential portability problems with the union and bit-fields below.
*/

#include <stdint.h> // For int32_t, etc.

union Float_t
{
    Float_t(float num = 0.0f) : f(num) {}
    // Portable extraction of components.
    bool Negative() const { return i < 0; }
    int32_t RawMantissa() const { return i & ((1 << 23) - 1); }
    int32_t RawExponent() const { return (i >> 23) & 0xFF; }

    int32_t i; /// Perhaps overflow when using doubles?
    float f;

    #ifdef _DEBUG
    struct
    {   // Bitfields for exploration. Do not use in production code.
        uint32_t mantissa : 23; /// 52 for double?
        uint32_t exponent : 8; /// 11 for double?
        uint32_t sign : 1;
    } parts;
    #endif
};

bool AlmostEqualUlps(float A, float B, int maxUlpsDiff)
{
    Float_t uA(A);
    Float_t uB(B);

    // Different signs means they do not match.
    if (uA.Negative() != uB.Negative())
    {
        // Check for equality to make sure +0==-0
        if (A == B)
            return true;
        return false;
    }

    // Find the difference in ULPs.
    int ulpsDiff = abs(uA.i - uB.i);
    if (ulpsDiff <= maxUlpsDiff)
        return true;

    return false;
}

However, I can't seem to reformat the code in such a way that it supports doubles. I even read up on the explanation, found here.

Does anyone know what would be the best way to tackle this?

Before anyone decides to mark this as a duplicate: don't, because the only question that was similar was meant for javascript, and the C++ answer was:

bool IsAlmostEqual(double A, double B)
{
    //http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
    long long aInt = reinterpret_cast<long long&>(A);
    if (aInt < 0) aInt = -9223372036854775808LL - aInt;
    long long bInt = reinterpret_cast<long long&>(B);
    if (bInt < 0) bInt = -9223372036854775808LL - bInt;
    return (std::abs(aInt - bInt) <= 10000);
}

Which doesn't use ULPs, but some kind of bound, and I'm not sure what -9223372036854775808LL - aInt is at all (perhaps where int64 overflows).

I suggest that you try with the modifications that you mentioned in your comments, they seem logical. Let us see then where it goes wrong and try to fix it. — A.S.H, Dec 18 '16 at 23:06
@A.S.H applying my comments, I get the error: `Error C2034 'double_t::mantissa': type of bit field too small for number of bits`. So then I set the mantissa as a `uint64_t`, and all goes well, but the sign is incorrectly `1`, while I compare `1.0 with 1.0`. The hex value of `1.0` seems correct: `0x3ff0000000000000`, but it's held by the exponent?? So I set the both the mantissa and the exponent as `uint64_t` and it seems to work. Am I still missing something? — Nikita, Dec 19 '16 at 02:52
All of them should be uint64_t, since they represent bit fields of a 64-bit` double`. I guess you changed also the types of `f` and `i` to double and int64_t respectively, the difintions of `RawMantissa` and `RawExponent`, etc. — A.S.H, Dec 19 '16 at 03:10
Changing the sign to `uint64_t` seems to work just as well when comparing `1.0 from multiplication` with `1.0`. A value of `0.099999999999999992` seems to produce integer `4591870180066957721`, mantissa `2702159776422297`, and exponent `1019`. A value of `0.10000000000000001` seems to produce integer `4591870180066957722`, mantissa `2702159776422298`, and exponent `1019`. The difference would thus be `1 ULP`. Does this seem correct? — Nikita, Dec 19 '16 at 03:20

Yan Zhou · Accepted Answer · 2016-12-19T05:35:56.720

1

I don't think your code work at all. Below is just one example where it is going to get it wrong. (This is not an answer, but the explanation is too long to fit into a comment)

int main()
{
    Float_t x;
    Float_t y;
    x.i = 0x7F800000;
    y.i = 0x7F7FFFFF;
    AlmostEqualUlps(x.f, y.f, 1); // return true
}

However, x.f is actually infinity and y.f is FLT_MAX. The difference by any definition is infinity. Unless this is your intended behavior, that is consider a finite number and infinity being almost equal. Your implementation of ULP is completely off the mark. In fact for the two numbers above, ULP is not even well defined.

Another example will be 0x7F800000 (or any number close to this, depending on your maxULP) and 0x7F800001 (NaN). Unlike the example above, there's not even an argument to made that they should be consider "almost equal".

On the other hand, you reject any pairs with different signs as not close enough, while in truth there are a lot subnormals between -FLT_MIN and FLT_MIN, which one might considered to be "almost equal". For example, 0x80000001 and 0x1 differ by 2ULP, but if you set maxULP to 2 in your function, it will return false.

If you can rule out denormals, infintities, NaNs, then to deal with double you just need to replace uint32_t with uint64_t as mentioned in others' comment.

edited Dec 19 '16 at 05:35

answered Dec 19 '16 at 05:13

Yan Zhou

2,709
2
22
37

I found the values for [IEE 754-1985](https://en.wikipedia.org/wiki/IEEE_754-1985#Examples), which seem different for [IEE 754-2008](https://en.wikipedia.org/wiki/Single-precision_floating-point_format#Single-precision_examples). Though I would imagine vc++ uses the latter and I should just ignore the previous one, right? – Nikita Dec 19 '16 at 11:10
@Byrk It's not about the difference in the two standards. It' s your failure to handle special values in IEEE floating points. Numbers with exponents set to all ones but significant zero are infinity, non-zeron are NaN. Numbers with exponents set to zero is subnormals. Normal numbers has an implicit 1 in MSB of significant. 80-bit long double in x86 has no implicit 1 while some other platforms have a different long double. These are things you need to consider when calculating ULP – Yan Zhou Dec 19 '16 at 16:13
On the topic of comparing floats, it was noted that as you get closed to zero `FLT_EPSILON`, defined in `float.h`, becomes useful, would this be an easy way of handling the exceptions, or would I have to do individual checks for `NaN` etc? – Nikita Dec 19 '16 at 23:13
@Byrk FLT_EPSILON means the difference between 1 and the next representable number. In other words, for numbers far smaller than 1 in magnitude (don't even need to be subnormal), it is a really big error while for numbers far larger than 1 it is an inmeasurable small value. And again, you are not dealing with special cases at all. Floating points are not equally spaced, that's why we need ULP for measuring of accuracy. I suggest you read the classic first http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html – Yan Zhou Dec 19 '16 at 23:18
That is a good read, and I would suggest looking at [this question](http://stackoverflow.com/questions/2249110/how-do-i-make-a-portable-isnan-isinf-function) for implementing the concepts. – Nikita Dec 25 '16 at 06:56

Comparing doubles properly using aliasing with integer representations and ULPs

1 Answers1