7

I have some numbers I'm trying to compare. They represent lengths of paths through different spaces.

Unfortunately for me, some imprecision was causing false comparisons. For instance, after noting the wrong effects, I found that I was having comparisons like this:

a = 384.527100541296
b = 384.52710054129614 // Note the trailing 14 

For my purposes, a and b are supposed to be equal.

I noted the guava has a fuzzyCompare() method for doubles which seems to do what I want in ignoring some of this precision:

private static final double COMPARISON_PRECISION=1e-10;

private static final Comparator<Double> fuzzyCompare= new Comparator<Double>(){
    public int compare(Double o1, Double o2) {
        return DoubleMath.fuzzyCompare(o1, o2, COMPARISON_PRECISION);
    }   
};

public int compareTo(Object o) {
    if (o instanceof Spam) {
       Spam other = (Spam) (o);
       return ComparisonChain.start()
       .compare(this.getLength(),other.getLength(),fuzzyCompare)
       //...
       .result();
    } else {
       throw new ClassCastException();
    }
}

The warning on that fuzzy compare did not slip my notice:

This is not a total ordering and is not suitable for use in Comparable.compareTo(T) implementations. In particular, it is not transitive

My question is, is this lack of transitivity a real problem? If it is, how would it present itself? I would think that if the comparison were really truly violated it would throw an error similar to this question: Java error: Comparison method violates its general contract, and its not doing that even against a variety of values I've tested.

Or perhaps since an IllegalArgumentException is a runtime error, I just haven't run across the problems yet because only some devious values will trigger the problem?

Or perhaps it is doing something wrong right now, it just subtle enough that I haven't noticed it?

Raedwald
  • 46,613
  • 43
  • 151
  • 237
code11
  • 1,986
  • 5
  • 29
  • 37
  • It is more subtle. Actually it depends on the initial sequence of your `List` to be sorted, whether `Collection.sort(List, Comparator)` will throw the [Java error: Comparison method violates its general contract](https://stackoverflow.com/questions/11441666/java-error-comparison-method-violates-its-general-contract) or not. – Thomas Fritsch Oct 30 '17 at 17:14

3 Answers3

6

TL;DR:

Your operator is not transitive. Consider a = 0, b = 0.6, c = 1.2 with a tolerance of 1. a==b, b==c but a!=c. The solution is to partition your values into classes (for example by rounding or truncating) and use Double.compare() to preserve transitivity.

Detailed explanation:

First lets discuss if your data is transitive while using fuzzyCompare(double, double, double):

While in most cases your data will be transitive, it is possible to generate samples which are not. Lets take the following values:

a = 384.52710054120
b = 384.52710054126
c = 384.52710054132

As you can see that using our new metric the following are true: a==b, b==c, but a!=c. As you can see, you have violated transitivity.

Is it a problem if your Comparator is non-transitive?

Methods assert certain conditions by using documentation and/or annotations. The compare method promises that the method is transitive. Breaking that promise might be ok for a lot of cases were transitivity is not important, but code which relies on that promise could be broken.

What is an example of code which might not work if the promise of transitivity is broken?

Lets create a scenario where we have 3 elements of type Foo which are not transitive according to some Comparator called fooComparator . We call them f1, f2 and f3.

Comparator<Foo> fooComparator = new Comparator<Foo>(){
    public int compare(Foo o1, Foo o2) {
        // some non-transitive return value
    }   
};

Since they are not transitive, lets assume f0 < f1, f1 < f2, f2 < f0 holds true. What would happen if you put them in a list and tried to sort() them?

List<Foo> foos = new LinkedList<>();
Collections.addAll(f1, f2, f3)
Collections.sort(foos, fooComparator);

How to fix the problem

You can create a transitive operator by mapping data to another data set and use a transitive operator defined on that set. Lets map the real numbers to the real numbers with a lower precision.

Consider the following values:

a = 0.01; b = 0.05; c = 0.13; d = 0.19; e = 0.21

If you truncate them to the second digit (Math.truncate(x * 10)/10) and use Double.compare(), transitivity is preserved.

You can see that we have put our values into three classes {a, b} < {c, d} < {e}. There surely is some important theorem which proves that this is the case, but I can't remember its name..

Community
  • 1
  • 1
Neuron
  • 5,141
  • 5
  • 38
  • 59
  • I suppose that in order for `f2 – code11 Oct 30 '17 at 15:33
  • well, if your data is transitive within the boundaries of your application: yes. you can ignore that warning. the warning is still very important, as that might not be the case for every application – Neuron Oct 30 '17 at 15:41
  • i have to admit i didn't get your question at the first shot. my answer is still: yes, its a problem. give me a minute to edit my answer – Neuron Oct 30 '17 at 15:48
  • i fixed my answer. look at the last secion i added – Neuron Oct 30 '17 at 15:54
  • 1
    Well. That is a problem. I hadn't thought of that. I suppose I'll have to track down whats causing the imprecision in the first place. – code11 Oct 30 '17 at 15:57
  • why dont you just round your numbers to the tenth digit and use the common [Double.compare()](https://docs.oracle.com/javase/7/docs/api/java/lang/Double.html#compareTo(java.lang.Double)) – Neuron Oct 30 '17 at 16:03
  • Wouldn't that be essentially the same problem? Choosing what rounding method would be important I think. – code11 Oct 30 '17 at 16:10
  • no, just truncate it at a certain precision. transitivity will be maintained. check it with the examples provided – Neuron Oct 30 '17 at 16:11
  • i added an example to the bottom of my answer – Neuron Oct 30 '17 at 16:16
1

is this lack of transitivity a real problem

Maybe, depends on the problem you're trying to solve. But you might run into subtle problems where code expecting implementations of Comparator to work in a transitive way. It is hard to say what the effects are, other than "undefined".

I would not be very happy if I saw this code in a review: you are overloading Java's well-defined concept of comparison with your own - valid, but different - notion of comparison.

If you call it something different - fuzzyCompare, FuzzyComparator etc - there is no confusion of the two notions.

Andy Turner
  • 137,514
  • 11
  • 162
  • 243
  • I guess I was trying to understand what the effects might be. You're saying that there is no general "bad effect" like a specific error, but that just the violation itself opens the code up to a whole world of "undefined" behavior? – code11 Oct 30 '17 at 15:29
  • 1
    @code11 exactly. There is nothing to say that it won't work, just you shouldn't be surprised if it doesn't. – Andy Turner Oct 30 '17 at 16:07
1

Using a non-transitive compareTo is a terrible idea:

  • Sorting may throw as already pointed.
  • Even worse, sorting may return wrong results, possibly completely wrong. It relies on the contract which you're violating and there's absolutely no guarantee that it ends up well (i.e., with an exception). Analyzing TimSort won't help, as the algorithm may get replaced by a better one in a few years.
  • Any SortedMap may break terribly. It may happen that things you put it, won't be found (such things do really happen with a HashMap with a broken equals or hashCode). Again, the implementation may change in a few years and then anything is possible.

I'd strongly suggest to name your method differently. Or, create a Comparator documented with a corresponding warning (this can lead to the same kind of problems, but it's much more explicit).

Note that with a broken Comparable, even a HashMap may break as in case of many collisions, it uses compareTo when possible.

maaartinus
  • 44,714
  • 32
  • 161
  • 320