6

I know that in Java 8 HashMap was optimized for poorly distributed hashCode. And in cases when threshold was exceeded it rebuilds nodes in bucket from linked list into tree. Also it is stated that this optimization doesn't work for not comparable keys (at leas performance is not improved). In the example below I put not Comparable keys into HashMap

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.stream.IntStream;

class Main {
    public static void main(String[] args) throws InterruptedException {
        Map<Key, Integer> map = new HashMap<>();

        IntStream.range(0, 15)
                .forEach(i -> map.put(new Key(i), i));

        // hangs the application to take a Heap Dump
        TimeUnit.DAYS.sleep(1);
    }
}

final class Key {
    private final int i;

    public Key(int i) {
        this.i = i;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Key key = (Key) o;
        return i == key.i;
    }

    @Override
    public int hashCode() {
        return 1;
    }
}

But Inspecting the Heap Dump shows that nodes was rearrange into Tree.

enter image description here

My question is why nodes is rebuilt into the tree if it will not improve performance and on which criteria in this case nodes is compared to figure out which key should be right node and which left?

Artem Petrov
  • 772
  • 4
  • 17

1 Answers1

8

I think that you sort of misunderstood what that answer was saying. Comparable is not needed, it's just an optimization that might be used when hashes are equal - in order to decide where to move the entry - to the left or to the right (perfectly balanced red-black tree node). Later if keys are not comparable, it will use System.identityHashcode.

figure out which key should be right node and which left

It goes to the right - the bigger key goes to the right, but then the tree might need to be balanced. Generally you can look-up the exact algorithm of how a Tree becomes a perfectly balanced red black tree, like here

Eugene
  • 117,005
  • 15
  • 201
  • 306
  • 1
    Afaik it doesn't uses the identity hash code, just the hash code of the key object. – Jorn Vernee Jul 15 '17 at 11:16
  • 1
    @JornVernee It *does*. See `tieBreakOrder` method inside `HashMap` – Eugene Jul 15 '17 at 11:18
  • 3
    @Eugene That's a fallback for when the hash codes are equal (which is not necessarily the case with a hash collision, though I guess it is in this example). If you look at where it's used in `putTreeVal`, you will see it tries to use the key's hash code before that. – Jorn Vernee Jul 15 '17 at 11:30
  • 1
    I'm sorry I misunderstand you. @Eugene is right. I have forgot the `tieBreakOrder`. it calculates the `dir` by its own hash first and then using `Comparable` and last using `System.identityHashCode` which is a global unique hash code. – holi-java Jul 15 '17 at 11:34
  • @Eugene Could you please provide some explanation how searching is working in case of building binary tree with help of `System.identityHashCode`. It looks like if we call `map.get(new Key(1))` key will have different identity hash code than key which was used on `map.put(new Key(1), 1)` and searching by other identity hash code we might not to find the value (even though key has same hashCode and equals) – Artem Petrov Jul 15 '17 at 11:57
  • 1
    @JornVernee well yes, that is not the default, it's only used when hashcodes are the same indeed (your first comment) – Eugene Jul 15 '17 at 12:01
  • 3
    @ArtemPetrov that's a great point! and here is the explanation for it: https://yermilov.github.io/blog/2017/02/24/tiebreaker-regarding-java-hashmap-treenode-and-tiebreakorder/ – Eugene Jul 16 '17 at 11:44