11

I know that HashMap does not guarantee the order. Consider the following code:

import java.util.HashMap;
import java.util.Map;

public class SandBox {
    protected static class Book {
        String name;

        public Book(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    protected static class MyThread extends Thread {
        @Override
        public void run() {
            super.run();
            final int n = 10;
            Book[] books = new Book[n];
            for (int i=0; i<n; i++)
                books[i] = new Book("b" + i);
            for (Book b : books)
                System.out.print(b + ", ");
            System.out.println();
            HashMap<Book, Object> hm = new HashMap<>();
            for (Book b : books)
                hm.put(b, null);
            for (Map.Entry<Book, Object> entry : hm.entrySet())
                System.out.print(entry.getKey() + ", ");
            System.out.println();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MyThread t = new MyThread();
        t.start();
        t.join();
    }
}

In each run, the order of HashMap is different (as expected). For example:

Output #1:

b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, 
b3, b4, b7, b9, b0, b8, b1, b2, b6, b5,

Output #2:

b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, 
b9, b4, b3, b7, b8, b0, b1, b5, b6, b2,

But the strange thing is that if I replace the lines

t.start();
t.join();

with

t.run();

(not using multithreading) the output is always the same:

b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, 
b0, b3, b7, b4, b2, b6, b9, b1, b5, b8, 

I don't understand the relationship between HashMap's order and Thread. Can someone please explain to me why is this happening?

Christophe Roussy
  • 16,299
  • 4
  • 85
  • 85
  • Duplicate ? https://stackoverflow.com/questions/4418896/what-causes-the-slightly-unpredictable-ordering-of-the-iterator-for-the-java-u – Christophe Roussy Mar 27 '18 at 07:34
  • @ChristopheRoussy I don't think it is duplicate; but it is related. – Mehran Mirkhan Mar 27 '18 at 07:41
  • You can make the output more interesting by changing your toString to `return name + " " + hashCode() % 10_000;` (using module to make the output fit on the screen) - you can see that the hashCode doesn't change between single-threaded runs and it does in a thread. As to why? *shrug* using a single thread is more deterministic. But you shouldn't depend on it in any case. – Erwin Bolwidt Mar 27 '18 at 07:44

2 Answers2

14

It's because HashMap order internally will depend on hashcode implementation.

Your Book class does not implement hashCode so it will use the default one

As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the JavaTM programming language.)

That means it will use memory address.

In your case, it happens that for a single thread, allocated memory addresses are the same on re-run, which is not the case in threaded version.

But this is only 'by accident' and you cannot rely on it even in single threaded (someone else will run it and get a different result, or even when you run it later you can get a different result as objects will have different memory addresses)

Please ALWAYS overwrite hashCode (&equals) when using objects in hashmap.

tung
  • 719
  • 2
  • 14
  • 30
Maciej
  • 1,954
  • 10
  • 14
  • 2
    Some IDEs like IntelliJ IDEA will generate the hashCode method for you, just hit alt-insert and pick hashcode. – Charlie Mar 27 '18 at 07:47
  • 5
    *"Please ALWAYS overwrite hashCode (&equals) when using objects in hashmap."* - This is not good advice. Object::hashCode should **NOT** be overridden if the object requires equals by identity semantics. – Stephen C Mar 27 '18 at 07:53
  • I agree you might need to want to have always unique value from hash code due to some business requirements, BUT *even in this case you should overwrite it* (well you can return some UUID or even super.hashCode) but it will tell other maintainers of your code that you really MEAN it - not that you forgot to overwrite it (they might try to 'correct' your 'mistake'). – Maciej Mar 27 '18 at 07:57
  • 4
    @StephenC Yes, but if you want to rely on identity semantics in a HashMap, you should normally use a `java.util.IdentityHashMap`, not force the object to not override hashCode. – Erwin Bolwidt Mar 27 '18 at 07:59
  • Agreed. But that doesn't make the advice in the question correct. "Usually" I would agree with. "ALWAYS" .... no. – Stephen C Mar 27 '18 at 08:16
  • 1
    @StephenC I disagree - please see 2 comments above - from maintenance point of view and code readability I would still overwrite it to make it clear that it is correct behavior for others. – Maciej Mar 27 '18 at 08:22
  • 2
    _That means it will use memory address._ It's not true by default for HotSpot. It uses random number generator for generating identity hashCode and stores it in object header. – turbanoff Mar 27 '18 at 13:16
-1

Not really an answer but as a complement: here is the code used for the inserts (put) in the HashMap code, TreeNode code:

/**
 * Tie-breaking utility for ordering insertions when equal
 * hashCodes and non-comparable. We don't require a total
 * order, just a consistent insertion rule to maintain
 * equivalence across rebalancings. Tie-breaking further than
 * necessary simplifies testing a bit.
 */
static int tieBreakOrder(Object a, Object b) {
    int d;
    if (a == null || b == null ||
        (d = a.getClass().getName().
         compareTo(b.getClass().getName())) == 0)
        d = (System.identityHashCode(a) <= System.identityHashCode(b) ?
             -1 : 1);
    return d;
}

As you can see it relies on System.identityHashCode which is native.

And in System:

/**
 * Returns the same hash code for the given object as
 * would be returned by the default method hashCode(),
 * whether or not the given object's class overrides
 * hashCode().
 * The hash code for the null reference is zero.
 *
 * @param x object for which the hashCode is to be calculated
 * @return  the hashCode
 * @since   JDK1.1
 */
public static native int identityHashCode(Object x);

Also see this answer: How does the JVM ensure that System.identityHashCode() will never change?

Christophe Roussy
  • 16,299
  • 4
  • 85
  • 85