Time for getting by comparable key from HashMap

Question

AFAIK since java 8 bucket structure was changed from linked list to tree.

So in case hashCode() method return constant and our key class implements Comparable interface then complexity to get element from single cell will be reduced from O(n) to O(log n).

I tried to check it:

 public static void main(String[] args) {
    int max = 30000;
    int times = 100;
    HashMap<MyClass, Integer> map = new HashMap<>();
    HashMap<MyComparableClass, Integer> compMap = new HashMap<>();

    for(int i = 0; i < max; i++) {
        map.put(new MyClass(i), i);
        compMap.put(new MyComparableClass(i), i);
    }

    long startTime = System.nanoTime();
    for (int i = max; i > max - times; i--){
        compMap.get(new MyComparableClass(i));
    }
    System.out.println(String.format("Key is comparable:    %d", System.nanoTime() - startTime));

    startTime = System.nanoTime();
    for (int i = max; i > max - times; i--){
        map.get(new MyClass(i));
    }
    System.out.println(String.format("Key isn't comparable: %d", System.nanoTime() - startTime));
}

MyComparableClass:

public class MyComparableClass
        implements Comparable {
    public Integer value;

    public MyComparableClass(Integer value) {
        this.value = value;
    }

    @Override
    public int compareTo(Object o) {
        return this.value - ((MyComparableClass) o).value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;

        MyComparableClass myClass = (MyComparableClass) o;

        return value != null ? value.equals(myClass.value) : myClass.value == null;
    }

    @Override
    public int hashCode() {
        return 1;
    }
}

MyClass is the same as MyComparableClass but doesn't implement Comparable interface.

And unexpectedly i always get results where time for getting value by non comparable key less than by comparable.

Key is comparable:    23380708
Key isn't comparable: 10721718

Can someone explain?

(1) Avoid using raw types: `public class MyComparableClass implements Comparable` ( 2) your benchmark is flawed (not enough iterations, no warm up etc.) Read: https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java. — assylias, Nov 13 '17 at 11:32
@Eran MyClass is the same as MyComparableClass but doesn't implement Comparable interface. — sann05, Nov 13 '17 at 11:55
@assylias Seemingly generic `compareTo` method **wasn't** used at all. Explicitly setting class helps a lot. — sann05, Nov 13 '17 at 12:14

score 5 · Accepted Answer · answered Nov 13 '17 at 13:21

There are some flaws with your benchmark, but in this case, the tendency still can be acknowledged. The problem with your code is that you are implementing the raw type Comparable, but the HashMap implementation verifies the type compatibility before deciding to use it for resolving hash collisions.

So in your case, the natural order is never used within the HashMap implementation, but your class implementing the raw type Comparable is causing a more expensive check. Have a look at HashMap.comparableClassFor(Object x):

static Class<?> comparableClassFor(Object x) {
    if (x instanceof Comparable) {
        Class<?> c; Type[] ts, as; Type t; ParameterizedType p;
        if ((c = x.getClass()) == String.class) // bypass checks
            return c;
        if ((ts = c.getGenericInterfaces()) != null) {
            for (int i = 0; i < ts.length; ++i) {
                if (((t = ts[i]) instanceof ParameterizedType) &&
                    ((p = (ParameterizedType)t).getRawType() ==
                     Comparable.class) &&
                    (as = p.getActualTypeArguments()) != null &&
                    as.length == 1 && as[0] == c) // type arg is c
                    return c;
            }
        }
    }
    return null;
}

For your key class not implementing Comparable, the test is over right at x instanceof Comparable. For the other class, this rather cheap instanceof test evaluates to true and the much more complicated test for the generic type signature is made which will then fail. And the result of this test is not remembered, but repeated not only for every key:

When you look at HashMap.find:

final TreeNode<K,V> find(int h, Object k, Class<?> kc) {
    TreeNode<K,V> p = this;
    do {
        int ph, dir; K pk;
        TreeNode<K,V> pl = p.left, pr = p.right, q;
        if ((ph = p.hash) > h)
            p = pl;
        else if (ph < h)
            p = pr;
        else if ((pk = p.key) == k || (k != null && k.equals(pk)))
            return p;
        else if (pl == null)
            p = pr;
        else if (pr == null)
            p = pl;
        else if ((kc != null ||
                  (kc = comparableClassFor(k)) != null) &&
                 (dir = compareComparables(kc, k, pk)) != 0)
            p = (dir < 0) ? pl : pr;
        else if ((q = pr.find(h, k, kc)) != null)
            return q;
        else
            p = pl;
    } while (p != null);
    return null;
}

you will see that after the test comparableClassFor failed, find is invoked recursively. It tries to remember the result of the test using kc and passing it down, but in the failed case it’s null and thus treated like not being made yet, in other words, the test will be repeated whenever descending into a subtree, but since the natural order is not used, this code might have to loop over the other subtree as well if the key has not been found in the first, also repeating the test in every iteration.

Or, in other words, this comparableClassFor(k) might be repeated for every element in that bucket in the worst case. And JVM optimizations might even change the result in favor of the class not implementing Comparable, as the JVM might recognize the repeated identical instanceof tests on the same key object and optimize them, whereas the test for the generic type signature is not an intrinsic JVM operation and its outcome less likely to get predicted.

Of course, the result changes dramatically when your MyComparableClass correctly implements Comparable<MyComparableClass>. Then, the natural order is used to change the time complexity from O(n) to O(log n), but also the test is only done once for each lookup, as the then-non-null result will be remembered in kc.

I believe it's pretty bad. It shouldn't reduce productivity when you poorly implement interface `Comparable`. — sann05, Nov 13 '17 at 13:34
@sann05: well it’s a case of poorly implementing `Comparable` *and* poorly implementing `hashCode()` within the same class. Still, I agree that remembering the result of this test is worth a consideration, especially, as there is [`ClassValue`](https://docs.oracle.com/javase/8/docs/api/?java/lang/ClassValue.html) allowing to implement such a cache easily (already taking care about efficiency, thread safety and still allowing the classes to get garbage collected). — Holger, Nov 13 '17 at 13:39

DodgyCodeException · Answer 2 · 2017-11-13T12:22:12.070

It still uses a binary tree even if your class is not comparable! As explained in https://stackoverflow.com/a/30180593/638028, it uses lower-level ways to compare two objects (such as System.identityHashCode) and your results are probably a combination of:

Not enough iterations, thereby not comparing JIT-compiled versions. Try reversing the timings (do non-comparable first, and then comparable) and see if you get different results. Use JMH to do proper benchmarking.
Calling your own compareTo method is not as fast as the methods HashMap uses internally for comparing non-comparables.
You are also timing the performance of the memory allocator, since you are creating a new class to pass to HashMap.get in each iteration (though the JVM will eventually perform escape analysis when the JIT compiler kicks in, and will turn this into a stack allocation).

Time for getting by comparable key from HashMap

2 Answers2