Why do we need LinkedHashMap if keySet() maintains order for a HashMap?

Question

public class HashMapKeySet {

public static void main(String[] args) {
    Map<HashCodeSame,Boolean> map=new HashMap();

    map.put(new HashCodeSame(10),true);
    map.put(new HashCodeSame(2),false);

    for(HashCodeSame i:map.keySet())
        System.out.println("Key: "+i+"\t Key Value: "+i.getA()+"\t Value: "+map.get(i)+"\t Hashcode: "+i
                .hashCode());

    System.out.println("\nEntry Set******");
    for(Map.Entry<HashCodeSame, Boolean> i:map.entrySet())
        System.out.println("Key: "+i.getKey().getA()+"\t Value: "+i.getValue()+"\t Hashcode: "+i.hashCode());

    System.out.println("\nValues******");
    for(Boolean i:map.values())
        System.out.println("Key: "+i+"\t Value: "+map.get(i)+"\t Hashcode: "+i.hashCode());

}

static class HashCodeSame{

    private int a;

    public int getA() {
        return a;
    }

    public void setA(int a) {
        this.a = a;
    }

    HashCodeSame(int a){
        this.a=a;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;

        HashCodeSame that = (HashCodeSame) o;

        return a == that.a;

    }

    @Override
    public int hashCode() {
        return 1;
    }
}

}

If you could see in above example, I have explicitly made the hashcode() return 1 in all cases, to check what happens, when there is a collision of key.hashcode() in hashmap. What happens, a linked list is maintained for these Map.Entry objects such as

1(key.hashcode()) will link to <2,false> will link to <10,true>

(as false value is entered after true value, as I understand).

But when I do keySet(), true is returned first and then false, instead of false returning first.

So , what I am assuming here, since keySet() is a set and set maintains order, we get true and false while iteration. But, then again, why don’t we say hashmap maintains order, as the only way to retrieve is by order. Or Why do we use LinkedHashMap?

 Key: DS.HashMapKeySet$HashCodeSame@1    Key Value: 10   Value: true     Hashcode: 1
Key: DS.HashMapKeySet$HashCodeSame@1     Key Value: 2    Value: false    Hashcode: 1

Entry Set******
Key: 10  Value: true     Hashcode: 1230
Key: 2   Value: false    Hashcode: 1236

Values******
Key: true    Value: null     Hashcode: 1231
Key: false   Value: null     Hashcode: 1237

Now , when I add chsnge the hashcode method to return a like

@Override
    public int hashCode() {
        return a;
    }

I get reverse order. Plus on addition of

    map.put(new HashCodeSame(10),true);
    map.put(new HashCodeSame(2),false);
    map.put(new HashCodeSame(7),false);
    map.put(new HashCodeSame(3),true);
    map.put(new HashCodeSame(9),true);

output received is,

    Key: DS.HashMapKeySet$HashCodeSame@2     Key Value: 2    Value: false    Hashcode: 2
Key: DS.HashMapKeySet$HashCodeSame@3     Key Value: 3    Value: false    Hashcode: 3
Key: DS.HashMapKeySet$HashCodeSame@7     Key Value: 7    Value: false    Hashcode: 7
Key: DS.HashMapKeySet$HashCodeSame@9     Key Value: 9    Value: true     Hashcode: 9
Key: DS.HashMapKeySet$HashCodeSame@a     Key Value: 10   Value: true     Hashcode: 10

Entry Set******
Key: 2   Value: false    Hashcode: 1239
Key: 3   Value: false    Hashcode: 1238
Key: 7   Value: false    Hashcode: 1234
Key: 9   Value: true     Hashcode: 1222
Key: 10  Value: true     Hashcode: 1221

Values******
Key: false   Value: null     Hashcode: 1237
Key: false   Value: null     Hashcode: 1237
Key: false   Value: null     Hashcode: 1237
Key: true    Value: null     Hashcode: 1231
Key: true    Value: null     Hashcode: 1231

Now it again makes me wonder, why is the order coming in sorted way.? Can anyone explain me in detail how does keySet(), entrySet() methods of hashmap works?

That is because items added with the same hashcode all end up in the same bucket, and the insertion order is preserved, that is not the case if you have distributed hashcodes. Using the same hashcode for all objects is a bad idea. — Mark Rotteveel, Jun 10 '16 at 09:15
"Why do we need LinkedHashMap if keySet() maintains order for a HashMap?" The ordering of hash map keys is undefined; if you are seeing them come out in an order you expect, that is coincidence, and not guaranteed always to be the case. — Andy Turner, Jun 10 '16 at 09:18
Can you make me understand the internal implementation of keySet()? As in this http://stackoverflow.com/questions/1882762/is-the-java-hashmap-keyset-iteration-order-consistent link it is given, keySet is always in same order as entered, although iterating over it would be costlier than iterating over linkedHashMap. — dgupta3091, Jun 10 '16 at 09:27
It is trivial to construct an example where iteration order != insertion order, e.g. http://ideone.com/SOe3Qh. *You don't need to understand the internal implementation*. All you need to know is there is no guarantee of iteration order for `HashMap`. — Andy Turner, Jun 10 '16 at 09:37
http://ideone.com/pLQuUU if you go through this link, you;ll see the order is preseved. @AndyTurner — dgupta3091, Jun 10 '16 at 09:54
It doesn't matter if you can construct an example where the ordering happens to be preserved: there are no ordering guarantees for `HashMap`, so that is mere coincidence: you cannot rely on the ordering being preserved. — Andy Turner, Jun 10 '16 at 09:57

score 6 · Accepted Answer · answered Jun 10 '16 at 20:33

HashMap does not have a defined iteration order, and LinkedHashMap does have a specified iteration order.

The difficulty with HashMap is that it's easy to construct simple examples where the iteration order is quite predictable and is fairly stable, even though this is not guaranteed.

Suppose for example that you did this:

    Map<String, Boolean> map = new HashMap<>();
    String str = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    for (int i = 0; i < str.length(); i++) {
        map.put(str.substring(i, i+1), true);
    }
    System.out.println(map.keySet());

The result is

[A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z]

Hey! Those are in order! Well, the reason is that String's hashCode() function is pretty lousy, and it's exceptionally lousy for single-character strings. Here is String's hashCode() specification. Essentially it's a sum-and-multiply, but for a single-character string it's just the Unicode value of that char. So the hashcodes of the single-character strings above are 65, 66, ... 90. The internal table of HashMap is always a power of two, and in this case it's 64 entries long. The table entry used is the key's hashCode() value right-shifted by 16 bits and XORed with itself, modulo the table size. (See the code here.) Thus, these single-character strings end up in sequential buckets in the HashMap table, in array positions 1, 2, ... 26.

Key iteration proceeds sequentially through the buckets, so the keys end up coming out in the same order they were put in. Again, this isn't guaranteed, it just so happens to work this way because of the characteristics of the various pieces of implementation as described above.

Now consider HashCodeSame where the hashCode() function returns 1 every time. Adding a few of these objects to a HashMap will cause them all to end up in the same bucket, and since iteration traverses sequentially down the linked list, they'll come out in order:

    Map<HashCodeSame, Boolean> map = new HashMap<>();
    for (int i = 0; i < 8; i++) {
        map.put(new HashCodeSame(i), true);
    }
    System.out.println(map.keySet());

(I've added a toString() method that does the obvious thing.) The result is:

[HCS(0), HCS(1), HCS(2), HCS(3), HCS(4), HCS(5), HCS(6), HCS(7)]

Again, the keys come out in order, because of coincidence of implementation, but for different reasons than above.

But wait! In JDK 8, HashMap will convert a bucket from a linear linked list to a balanced tree if too many entries appear in the same bucket. This happens if more then 8 entries end up in the same bucket. Let's try that:

    Map<HashCodeSame, Boolean> map = new HashMap<>();
    for (int i = 0; i < 20; i++) {
        map.put(new HashCodeSame(i), true);
    }
    System.out.println(map.keySet());

The result is:

[HCS(5), HCS(0), HCS(1), HCS(2), HCS(3), HCS(4), HCS(6),
HCS(18), HCS(7), HCS(11), HCS(16), HCS(17), HCS(15), HCS(13),
HCS(14), HCS(8), HCS(12), HCS(9), HCS(10), HCS(19)]

The bottom line is that HashMap does not maintain a defined iteration order. If you want a specific iteration order, you must use LinkedHashMap or a sorted map such as TreeMap. Unfortunately, HashMap has an iteration order that is fairly stable and predictable, in fact, just predictable enough to lull people into thinking that its order is well-defined, when in fact it isn't.

To help combat this problem, in JDK 9, new hash-based collections implementations will randomize their iteration order from run to run. For example:

    Set<String> set = Set.of("A", "B", "C", "D", "E",
                             "F", "G", "H", "I", "J");
    System.out.println(set);

This printed out the following when run in different invocations of the JVM:

[I, H, J, A, C, B, E, D, G, F]
[C, B, A, G, F, E, D, J, I, H]
[A, B, C, H, I, J, D, E, F, G]

(The iteration order is stable within a single run of the JVM. Also, existing collections such as HashMap do not have their iteration order randomized.)

I do not agree that `String.hashCode` is lousy, even when narrowing the view at single character `String`s. All single character `String`s have a distinct hash code, so it’s unclear what you would expect more from it. Performing arbitrary transformations of that value, in the hope that a particular hash map implementation gets a benefit? Since it is the design decision of that particular hash map implementation, to use powers of two as size, it’s also the task of that hash map implementation to perform appropriate transformations, especially considering, how often it already changed… — Holger, Jun 20 '16 at 12:44
@Holger `String.hashCode` provides distinct hash codes, so it's ok in that respect, but it doesn't distribute the values well across the 32-bit hash code space. That's where it's lousy. Good distribution is important for things like closed hashing or if you want to split the table for parallel processing. `HashMap` has attempted to do bit mixing in various ways, but it's ineffectual for short strings, since their hash codes have so many zeroes. — Stuart Marks, Jun 23 '16 at 01:51
@Stuart Marks: I’m not sure whether good distribution across the 32 bit hash code space (beyond providing distinct values) is ever demanded by the specification. Considering the practically infinite value space of `String`s, I’m not surprised that perfect distribution for short strings is not a priority. As a side note, the alternative hashing (murmur32) tried in Java 7 produced *more* collisions for all practical cases I tested… — Holger, Jun 23 '16 at 09:24
After reading your explanation again, I got my doubts. The values `65 to 90` XOR’ed with their higher bits (which are all zeros) will again yield the values `65 to 90`, applying modulo 64 (AND 63), produces the values `1 to 26`. So they happen to end up in consecutive buckets in the same order as their natural order, which is still an artifact of the hashing algorithm and no guaranteed order, but perfect from a performance point of view. And they would be in the same buckets even without the XOR. So what’s wrong with `String.hashCode` for these `String`s? — Holger, Jun 23 '16 at 09:37
@Holger In terms of `HashMap` collisions, `String.hashCode` is fine. On other criteria it's not so good. If all the values are clumped together, this will result in unbalanced splitting if the `HashMap` is used as the source for a parallel stream. Or, if a closed hashing scheme is used -- such as for the JDK 9 immutable set and map -- clumpiness results in poor performance for linear probing. — Stuart Marks, Jul 09 '16 at 00:24

score 0 · Answer 2 · answered Jun 10 '16 at 09:43

Answer for your question in Java doc for LinkedHashMap

Hash table and linked list implementation of the Map interface, with predictable iteration order. This implementation differs from HashMap in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is normally the order in which keys were inserted into the map (insertion-order). Note that insertion order is not affected if a key is re-inserted into the map. (A key k is reinserted into a map m if m.put(k, v) is invoked when m.containsKey(k) would return true immediately prior to the invocation.)

This implementation spares its clients from the unspecified, generally chaotic ordering provided by HashMap (and Hashtable), without incurring the increased cost associated with TreeMap. It can be used to produce a copy of a map that has the same order as the original, regardless of the original map's implementation:

 void foo(Map m) {
     Map copy = new LinkedHashMap(m);
     ...
 }

My problem is, while working on JDK 1.8 I can see the order preserved, no matter how many times I remove or add an element. How is that happening? — dgupta3091, Jun 10 '16 at 09:48

Why do we need LinkedHashMap if keySet() maintains order for a HashMap?

2 Answers2