Java - Hashmap retreival sequence

Question

import java.util.HashMap;
import java.util.Map.Entry;
public class TestString {
    public static void main(String[] args) {
        System.gc();

        String str = "deepak";
        int length = str.length();
        System.out.println("str " + str + " length " + length);

        HashMap<Character,Integer> map = new HashMap<Character,Integer>();
        for(int i=0; i<length; i++){
            char ch = str.charAt(i);
            if(map.containsKey(ch)){
                Integer val = map.get(ch);
                map.put(ch, val+1);
            }else{
                map.put(ch, 1);
            }
        }

        for (Entry<Character, Integer> entry : map.entrySet())
        {
            int hashCode = entry.hashCode();
            char key = entry.getKey();
           // int hash = hash();
            System.out.println("hashcode  " + hashCode + " hashcode of key>> " + entry.getKey().hashCode() + " key : " + key);
        }
        System.out.println(">>> " + map);
    }
}

Output :
str deepak length 6

hashcode 113 hashcode of key>> 112 key : p

hashcode 96 hashcode of key>> 97 key : a

hashcode 101 hashcode of key>> 100 key : d

hashcode 103 hashcode of key>> 101 key : e

hashcode 106 hashcode of key>> 107 key : k

>>> {p=1, a=1, d=1, e=2, k=1}

Can anyone help me to understand the 2 things from the program and output:

The data printed by map object, how it decide the sequence internally? eg. it is printing sequence p, a, d, e, k.
What is the difference in entry.hashcode() and entry.key().hashcode()? Please refer the output to explain the difference.

The internal order of your Entries is not guaranteed by a HashMap. They can be in any oder. Have a look [here](http://stackoverflow.com/questions/683518/java-class-that-implements-map-and-keeps-insertion-order). — GAlexMES, Mar 09 '17 at 10:16
The difference is explained in the [Java API](https://docs.oracle.com/javase/7/docs/api/java/util/Map.Entry.html#hashCode()) — GAlexMES, Mar 09 '17 at 10:17
What’s the point of calling `System.gc()` at the beginning of the program? — Holger, Mar 10 '17 at 13:56

score 0 · Answer 1 · edited May 23 '17 at 12:32

0

As far as I know, a HashMap is not garantueing the order of the Entrys or the keys. If you want to have them ordered you will need a TreeMap or linked map. See here.. A HashMap simply has no order.
I already commented you the link to the Java API, which explains it pretty well. But I will explain it reffering to your output:

hashcode 113 hashcode of key>> 112 key : p

That means: The entry HashCode is 113. The HashCode of your Key is 112. You don't print the HashCode of your Entry, which is 1. The HashCode of the Entry is calculated with the HashCode of the Key and the HashCode of the Value. Those to HashCodes will be xored.

edited May 23 '17 at 12:32

Community

1
1

answered Mar 09 '17 at 10:33

GAlexMES

335
2
20

I agree there is no order guaranteed by HashMap but I have tried multiple string data multiple times but it always respond in the same sequence for the respective String data so it means it is having some internal pattern to print the output. – Somesh Gupta Mar 09 '17 at 11:59
No it has not. If you try the same thing on different machines, or different OS, different processor architecture or just try it often enough. Then there will be a different order. – GAlexMES Mar 09 '17 at 12:01

score 0 · Answer 2 · answered Mar 10 '17 at 14:20

The iteration order is unspecified and technically, it’s a function over the key hashcodes, the actual map implementation, the map’s capacity and in some cases, the history of the particular map instance.

You can easily show the dependency to the map’s capacity by changing your code to

String str = "deepak";
int length = str.length();
System.out.println("str " + str + " length " + length);

HashMap<Character,Integer> map = new HashMap<>(100);
for(int i=0; i<length; i++) {
    char ch = str.charAt(i);
    map.merge(ch, 1, Integer::sum);
}

for(Entry<Character, Integer> entry: map.entrySet()) {
    int hashCode = entry.hashCode();
    Character key = entry.getKey();
    System.out.printf("hashcode %3d, hashcode of key %3d, key : %s%n",
                      hashCode, key.hashCode(), key);
}
map = new HashMap<>(map);
System.out.println(">>> " + map);

in my environment, it prints

str deepak length 6
hashcode  96, hashcode of key  97, key : a
hashcode 101, hashcode of key 100, key : d
hashcode 103, hashcode of key 101, key : e
hashcode 106, hashcode of key 107, key : k
hashcode 113, hashcode of key 112, key : p
>>> {p=1, a=1, k=1, d=1, e=2}

showing how the maps with different capacity exhibit different iteration orders.

The key and the Map.Entry instance are a different objects, hence, it’s not surprising that they have different hash codes. The hash code of Map.Entry is well specified

The hash code of a map entry e is defined to be:
(e.getKey()==null   ? 0 : e.getKey().hashCode()) ^
(e.getValue()==null ? 0 : e.getValue().hashCode())
This ensures that e1.equals(e2) implies that e1.hashCode()==e2.hashCode() for any two Entries e1 and e2, as required by the general contract of Object.hashCode.

This is in line with the intended behavior of the Map views. If you have two Maps, a and b, then

a.keySet().equals(b.keySet()) will be true iff both maps have the same keys.
a.entrySet().equals(b.entrySet()) will be true iff both maps have the same keys and each key maps to the same value for each map, in other words, both maps are equal.

This is even specified in Map.equals:

Returns true if the given object is also a map and the two maps represent the same mappings. More formally, two maps m1 and m2 represent the same mappings if m1.entrySet().equals(m2.entrySet()).
Since keys and entries are different things, a.keySet().equals(a.entrySet()) can only be true, if a is empty.

Sure! Why not downvote one of the most respectful users on so many tags without a comment. This is a good (and detailed!) answer... one plus — Eugene, Mar 11 '17 at 01:40

Eugene · Answer 3 · 2017-03-13T09:20:49.200

It's a bit unclear what you are trying to prove here or ask, but how about a much simpler example:

 HashMap<String, Integer> map = new HashMap<>();
    map.put("HZV", 1);
    map.put("CJX", 2);

    System.out.println(map); // {CJX=2, HZV=1}

    for (int i = 0; i < 16; ++i) {
        map.put("" + i, i);
    }

    for (int i = 0; i < 16; ++i) {
        map.remove("" + i);
    }

    System.out.println(map); // {HZV=1, CJX=2}

Notice how both prints in the example above store the same contents, the entry in the map are the same; the only difference is that they are in different order.

This is because the number of buckets in the internal map storage has increased and because of that entries were re-hashed and they moved to other buckets. This is obvious that there is no guaranteed sequence when retrieving entries from a Map.

And other answers here, have pointed that out, that the specification of the Map says that also.

This is even better seen in jdk-9 immutable maps:

Map<String, Integer> map2 = Map.of("1", 1, "2", 2, "3", 3);
System.out.println(map2);

The output of this map2 will vary from run to run! It perfectly legal for an output like this (running twice in different VMs):

{1=1, 3=3, 2=2}

{1=1, 2=2, 3=3}

This is exactly because a Map does not have an order, so they implemented a randomization pattern inside the Map.

EDIT

A little bit of explanation.

I'll try, but this is a big subject overall. So a HashMap will use buckets to store it's key/value pairs. Buckets are actually an array of Nodes (Red/Black TreeNode or LinkedNode) depending on size. Entries go to certain bucket using the hashCode of the key and modulo. Well not module directly, but a trick using & operator (search power of two hash vs modulo). The re-size (actually the doubling on the internal array - thus triggers a re-hash) is done under some conditions, the most obvious one is when the load_factor is reached(but that's not the only one). So if you declare: new HashMap(16), the load factor is 0.75, so when you add the 13 entry, the internal array will re-size to 32 entries. This will also trigger a re-hash of the keys (under the hood there's one more bit that is taken into consideration when choosing the bucket - so entires might move - like in the example I provided). This is NOT JVM dependent, but implementation dependent. This is the behavior at the moment in jdk-8 and jdk-9. But either way you should not care, but instead rely on the general contracts that the Map offers.

Thanks for explaining in detail. I know in hashmap resizing happens with 75% but I am not sure how it actually works for some specific program or machine. Can you please explain how resizing take place in memory for hashmap and on what point it actually trigger the resizing? Is resizing is soemthing specific to program or it is dependent on JVM? — Somesh Gupta, Mar 13 '17 at 09:01
@SomeshGupta I've done an edit to add *some* details. But generally this is a big subject that needs to be separated; it is not easy to understand at once. I've tried to tackle that some time ago and here is where I posted my findings.. https://www.slideshare.net/secret/8k8EP71cNydeib — Eugene, Mar 13 '17 at 09:21
Thanks @Eugene. But when you say this line - "The re-size (actually the doubling on the internal array - thus triggers a re-hash) is done under some conditions", so you mean internal array List (Map.Entry) or you are talking about something else? — Somesh Gupta, Mar 13 '17 at 09:29
@SomeshGupta no, I mean this: `Node[] table;` - an array of `Node`s (key/value class). This is what gets re-sized. You can look inside HashMap code and see what this `Node` is - it's a *holder* for the key/value — Eugene, Mar 13 '17 at 09:31

Java - Hashmap retreival sequence

3 Answers3