7

Background: Floating point numbers have rounding issues, so they should never be compared with "==".

Question: In Java, how do I test whether a list of Double contains a particular value. I am aware of various workarounds, but I am looking for the most elegant solution, presumably the ones that make use of Java or 3rd party library features.

import java.util.ArrayList;
import java.util.List;

public class Test {
    public static void main(String[] args) {
        // should be 1.38, but end up with 1.3800000000000001
        Double d1 = new Double(1.37 + 0.01);
        System.out.println("d1=" + d1);

        // won't be able to test the element for containment
        List<Double> list = new ArrayList<Double>();
        list.add(d1);
        System.out.println(list.contains(1.38));
    }
}

Output is:

d1=1.3800000000000001
false

Thank you.

Skyline
  • 145
  • 2
  • 11
  • 6
    There is no elegant math library solution in core Java (that I know of) -- you either have to roll your own using a decent test that epsilon is less than some minimum value, or use a 3rd party library. Speaking of which, I do believe that the Apache library has something along these lines. – Hovercraft Full Of Eels May 17 '14 at 00:23
  • Possible duplicate [comparing float/double values using == operator](http://stackoverflow.com/questions/6837007/comparing-float-double-values-using-operator?rq=1) – Jeff Ward May 17 '14 at 00:25
  • Following from @HovercraftFullOfEels, you can simply write your own method to check if a list contains a value by looping over the collection. It will be equally efficient and the implementation is straightforward. – Whymarrh May 17 '14 at 00:25
  • Good to know. I modified the question to include 3rd party libraries. – Skyline May 17 '14 at 00:25
  • @Whymarrh: I'm not so sure of how simple. Are you recommending that he use ratios of `Math.abs(observed - expected) / expected` or no ratio when calculating epsilon? – Hovercraft Full Of Eels May 17 '14 at 00:27
  • @HovercraftFullOfEels I wasn't really referring to the comparison itself, more that OP should write the `contains` method. – Whymarrh May 17 '14 at 00:30
  • @Whymarrh: but the key here really **is** the comparison itself. His issue is with floating point numbers, not iterating through a collection. As for my question -- I recommend using the ratio. – Hovercraft Full Of Eels May 17 '14 at 00:34
  • 1
    The Apache commons method mentioned by Hovercraft: [`ArrayUtils.contains()`](http://commons.apache.org/proper/commons-lang/javadocs/api-3.3.2/org/apache/commons/lang3/ArrayUtils.html#contains%28double[],%20double,%20double%29) – Petr Janeček May 17 '14 at 00:38
  • This is another possible solution found here on SO: http://stackoverflow.com/questions/179427/how-to-resolve-a-java-rounding-double-issue. From what I gather, it doesn't eliminate rounding errors, and division can be tricky, but it is a partial solution. – Mark W May 17 '14 at 00:56
  • 4
    Floating point numbers _can_ be compared with `==`, so it's incorrect to say _never_. But you have to understand what you're doing, and bear in mind that it's arithmetic operations that introduce the rounding errors, not comparisons. – Dawood ibn Kareem May 17 '14 at 01:27
  • Where's the obligatory link to "what every computer scientist should know about floating-point arithmetic"? – tmyklebu May 17 '14 at 03:22
  • After 3 years, I found this article to be quite informative, and solves this question (to some extent): http://bitbashing.io/comparing-floats.html – Skyline Mar 30 '17 at 22:54

4 Answers4

5

The general solution would be to write a utility method that loops through the list and checks if each element is within a certain a threshold of the target value. We can do slightly better in Java 8, though, using Stream#anyMatch():

list.stream().anyMatch(d -> (Math.abs(d/d1 - 1) < threshold))

Note that I am using the equality test suggested here.

If you're not using Java 8, I would write a simple utility method along the lines of this:

public static boolean contains(Collection<Double> collection, double key) {
    for (double d : collection) {
        if (Math.abs(d/key - 1) < threshold)
            return true;
    }
    return false;
}

Note that you might need to add a special case to both of these approaches to check if the list contains 0 (or use the abs(x - y) < eps approach). That would just consist of adding || (abs(x) < eps && abs(y) < eps) to the end of the equality conditions.

Community
  • 1
  • 1
arshajii
  • 127,459
  • 24
  • 238
  • 287
  • isn't this doing unecessary autoboxing/unboxing? – rds May 17 '14 at 00:55
  • @rds How would you avoid it? – arshajii May 17 '14 at 00:56
  • with unelegant solutions such as a custom function http://www.java2s.com/Code/Java/Collections-Data-Structure/Twodoublevaluearraysarealmostequal.htm – rds May 17 '14 at 00:58
  • @rds But the OP has `List`, not a `double[]`. – arshajii May 17 '14 at 00:59
  • 1
    @rds If he had a `double[]`, he could use `Arrays.stream()` to create a [`DoubleStream`](http://docs.oracle.com/javase/8/docs/api/java/util/stream/DoubleStream.html) which also has `anyMatch()`. – ajb May 17 '14 at 01:07
  • 1
    Right! Sorry. Also, the code needs to be adjusted it you want to check whether the list contains 0 – rds May 17 '14 at 01:18
  • @rds That's true; I made a note of that. – arshajii May 17 '14 at 01:20
  • This won't work if you're looking for zero. I've read your note - but appending something like `|| (abs(x) < eps)` to the end won't get rid of the division by zero error that has already occurred by the time this code is evaluated. – Dawood ibn Kareem May 17 '14 at 01:30
  • `NaN` does not compare equal to anything. First, `0.0/0.0` is `NaN`. Second, what if you're looking for `NaN`? – tmyklebu May 17 '14 at 03:21
  • 1
    @tmyklebu The OP can add the `isnan` checks as well if he requires them; the overall format of the solution (which is the point of this answer) will not change. – arshajii May 17 '14 at 03:28
  • Sorry, you're right. I retract my earlier comment. Of course, it's always a good idea to include zero as a test case when you test code like this. – Dawood ibn Kareem May 17 '14 at 03:34
3

Comparing bits wasn't a good idea. Similar to another post, but deals with NaN and Infinities.

import java.util.ArrayList;
import java.util.List;

public class Test {


    public static void main(String[] args) {
        // should be 1.38, but end up with 1.3800000000000001
        Double d1 = 1.37d + 0.01d;
        System.out.println("d1=" + d1);

        // won't be able to test the element for containment
        List<Double> list = new ArrayList<>();
        list.add(d1);

        System.out.println(list.contains(1.38));
        System.out.println(contains(list, 1.38d, 0.00000001d));
    }

    public static boolean contains(List<Double> list, double value, double precision) {
        for (int i = 0, sz = list.size(); i < sz; i++) {
            double d = list.get(i);
            if (d == value || Math.abs(d - value) < precision) {
                return true;
            }
        }
        return false;
    }
}
1

You could wrap the Double in another class that provides a 'close enough' aspect to its equals method.

package com.michaelt.so.doub;

import java.util.HashSet;
import java.util.Set;

public class CloseEnough {
    private Double d;
    protected long masked;
    protected Set<Long> similar;

    public CloseEnough(Double d) {
        this.d = d;
        long bits = Double.doubleToLongBits(d);
        similar = new HashSet<Long>();
        masked = bits & 0xFFFFFFFFFFFFFFF8L; // 111...1000
        similar.add(bits);
        similar.add(bits + 1);
        similar.add(bits - 1);
    }

    Double getD() {
        return d;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof CloseEnough)) {
            return false;
        }

        CloseEnough that = (CloseEnough) o;

        for(Long bits : this.similar) {
            if(that.similar.contains(bits)) { return true; }
        }
        return false;
    }

    @Override
    public int hashCode() {
        return (int) (masked ^ (masked >>> 32));
    }
}

And then some code to demonstrate it:

package com.michaelt.so.doub;

import java.util.ArrayList;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        List<CloseEnough> foo = new ArrayList<CloseEnough>();
        foo.add(new CloseEnough(1.38));
        foo.add(new CloseEnough(0.02));
        foo.add(new CloseEnough(1.40));
        foo.add(new CloseEnough(0.20));
        System.out.println(foo.contains(new CloseEnough(0.0)));
        System.out.println(foo.contains(new CloseEnough(1.37 + 0.01)));
        System.out.println(foo.contains(new CloseEnough(0.01 + 0.01)));
        System.out.println(foo.contains(new CloseEnough(1.39 + 0.01)));
        System.out.println(foo.contains(new CloseEnough(0.19 + 0.01)));
    }
}

The output of this code is:

false
true
true
true
true

(that first false is the compare with 0, just to show that it isn't finding things that aren't there)

CloseEnough is just a simple wrapper around the double that masks the lowest three bits for the hash code (enough that and also stores the valid set of similar numbers in a set. When doing an equals comparison, it uses the sets. Two numbers are equal if they contain a common element in their sets.

That said, I am fairly certain that there are some values that would be problematic with a.equals(b) being true and a.hashCode() == b.hashCode() being false that may still occur at edge conditions for the proper bit patterns - this would make some things (like HashSet and HashMap) 'unhappy' (and would likely make a good question somewhere.


Probably a better approach to this would instead be to extend ArrayList so that the indexOf method handles the similarity between the numbers:

package com.michaelt.so.doub;

import java.util.ArrayList;

public class SimilarList extends ArrayList<Double> {

    @Override
    public int indexOf(Object o) {
        if (o == null) {
            for (int i = 0; i < this.size(); i++) {
                if (get(i) == null) {
                    return i;
                }
            }
        } else {
            for (int i = 0; i < this.size(); i++) {
                if (almostEquals((Double)o, this.get(i))) {
                    return i;
                }
            }
        }
        return -1;
    }

    private boolean almostEquals(Double a, Double b) {
        long abits = Double.doubleToLongBits(a);
        long bbits = Double.doubleToLongBits(b);

        // Handle +0 == -0
        if((abits >> 63) != (bbits >> 63)) {
            return a.equals(b);
        }

        long diff = Math.abs(abits - bbits);

        if(diff <= 1) {
            return true;
        }

        return false;
    }
}

Working with this code becomes a bit easier (pun not intended):

package com.michaelt.so.doub;

import java.util.ArrayList;

public class ListTest {
    public static void main(String[] args) {
        ArrayList foo = new SimilarList();

        foo.add(1.38);
        foo.add(1.40);
        foo.add(0.02);
        foo.add(0.20);

        System.out.println(foo.contains(0.0));
        System.out.println(foo.contains(1.37 + 0.01));
        System.out.println(foo.contains(1.39 + 0.01));
        System.out.println(foo.contains(0.19 + 0.01));
        System.out.println(foo.contains(0.01 + 0.01));
    }
}

The output of this code is:

false
true
true
true
true

In this case, the bit fiddling is done in the SimilarList based on the code HasMinimalDifference. Again, the numbers are converted into bits, but this time the math is done in the comparison rather than trying to work with the equality and hash code of a wrapper object.

  • That "works" for `1.37` and `0.01`, but it "fails" for `1.4` and `0.2`. – tmyklebu May 17 '14 at 03:19
  • @tmyklebu yes, and is a problem with all bit masking approaches - they will miss a `01` being the same distance away from `10` as `00`. If the mask is changed to `0xf...c` or `0xf...8`, there is a larger range, but still the edge conditions. However, trying to make `equals()` into a similar function using math means that the hashCode contract is broken (point #2 in http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode() ) - so that isn't an acceptable solution either. Another option would be to extend the collection and its lookup, but thats getting very involved –  May 17 '14 at 03:30
  • No; the real issue here is that the problem you're trying to solve hasn't been properly framed. If you're screwing around masking out bits of `double`s before comparing them for "almost-equality", that almost certainly means your code doesn't work to begin with and you're just trying to put lipstick on the pig. – tmyklebu May 17 '14 at 03:34
0

Background: Floating point numbers have rounding issues, so they should never be compared with "==".

This is false. You do need to be awake when writing floating-point code, but reasoning about the errors that are possible in your program is in many cases straightforward. If you cannot do this, it behooves you to at least get an empirical estimate of how wrong your computed values are and think about whether the errors you're seeing are acceptably small.

What this means is that you cannot avoid getting your hands dirty and thinking about what your program is doing. If you're going to work with approximate comparisons, you need to have some idea of what differences mean that two values really are different and what differences mean that two quantities might be the same.

// should be 1.38, but end up with 1.3800000000000001

This is also false. Notice that the closest double to 1.37 is 0x1.5eb851eb851ecp+0 and the closest double to 0.01 is 0x1.47ae147ae147bp-7. When you add these, you get 0x1.6147ae147ae14f6p+0, which rounds to 0x1.6147ae147ae15p+0. The closest double to 1.38 is 0x1.6147ae147ae14p+0.

There are several reasons why two slightly different doubles do not compare ==. Here are two:

  • If they did so, that would break transitivity. (There would be a, b, and c such that a == b and b == c but !(a == c).
  • If they did so, carefully-written numerical code would stop working.

The real issue with trying to find a double in a list is that NaN does not compare == to itself. You may try using a loop that checks needle == haystack[i] || needle != needle && haystack[i] != haystack[i].

tmyklebu
  • 13,915
  • 3
  • 28
  • 57
  • @Skyline: You cannot do what you want to do without first figuring out what you want to do. Here, that means figuring out what's wrong with the `Double`s in your list. – tmyklebu May 19 '14 at 02:28
  • Appreciate your clarification for the audience. But unless you suggest the question is meaningless, I stand by my words. So could you append your answer with some possible solutions? – Skyline May 19 '14 at 05:02
  • I'm sorry to say that your comment was neither constructive nor polite. All I requested was for you to provide an suggestion. Following your example, you should probably add more comments like "you should never go physical against your partner. However, you should try another approach when communicating with your partner such as blah blah.". Wouldn't it be more constructive and informative than just stating that the question is wrong? – Skyline May 20 '14 at 00:41