10

quick question to be sure I understood well how HashMap in java works.

Here is an example of code:

//String key = new String("key");
//String val = new String("value");
String key = "key";
String val = "value";

HashMap map = new HashMap();
map.put(key, val);

System.out.println("hashmap object created. Its key hashcode = "+key.hashCode());
// the hashcode is 106079
System.out.println("hashmap object value for key = "+map.get(key));

// Let's store using a key with same hashcode
Integer intkey = new Integer(106079);
val = "value2";
map.put(intkey, val);
System.out.println("hashmap object created. Its intkey hashcode = "+intkey.hashCode());
// this returns 106079 once again. So both key and intkey have the same hashcode

// Let's get the values
System.out.println("hashmap object value for intkey = "+map.get(intkey));
System.out.println("hashmap object value for key = "+map.get(key));

the returned values are as expected. I read that behind the scene, HashMap works as follows:

  1. get the key/value.
  2. make a hashcode from the key.
  3. store in bucket the key and value objects (in my case bucket number 106079)

same again for the second one.

  1. stores it within the same bucket but as this is at this point a LinkedList, I suppose store it at "next available allocation".

To get it:

  1. pick up the key, get hashcode,
  2. look at the bucket,
  3. then look at the key from the first element in the LinkedList,
  4. then check if key passed and key from element match, if not then continue to the next, and so on until value can be retrieved.

Am I understanding this concept correctly?

Many thanks!

EDIT:

many thanks, so to complete: - How does Java HashMap store entries internally - How does a Java HashMap handle different objects with the same hash code?

And:

  • There shouldn't be that many "buckets"
  • A bucket is not the same as an entry. A bucket is a logical collection of all entries sharing the same bucket# (which in the Java case is a function of the key's hashCode()). As stated in other answers, bucket overflow could be implemented several ways, not necessarily in a List.
  • There are other existing collision resolutions: http://en.wikipedia.org/wiki/Hash_table#Collision_resolution
  • it uses Object's equals method to compare and retrieve objects from the same bucket.
Community
  • 1
  • 1
user2929613
  • 405
  • 1
  • 5
  • 14
  • Yup, that's the way it happens. Of course you probably wouldn't have that many buckets actually. – Kayaman Oct 28 '13 at 20:42
  • 1
    You have summarized it pretty much. Just for the completion sake, it uses Object's equals method to compare and retrieve objects from the same bucket. – csn Oct 28 '13 at 20:42
  • http://stackoverflow.com/questions/6493605/how-does-a-hashmap-work-in-java – munyengm Oct 28 '13 at 20:44
  • 2
    BTW, use String x = "something", not String x = new String("something"). Since Java Strings are immutable, the latter is just a waste of space and time. – MrBackend Oct 28 '13 at 20:59

3 Answers3

3

Please also note, there are several ways HashMap can implement hash codes collision resolution, not only utilizing linked list as you mentioned

Java's HashMap implementation does not only use LinkedList strategy to handle key-values with same key.hashCode() values.

Also, you may want to read this article

kiruwka
  • 9,250
  • 4
  • 30
  • 41
1

Yes, your understanding is correct. Just note that a single bucket is assigned many hashcodes: in a fresh HashMap there are altogether 16 buckets, each assigned a total of 232/16 = 228 hashcodes.

Marko Topolnik
  • 195,646
  • 29
  • 319
  • 436
  • Thanks. But then what is this concept of bucket? So you have a small amount of bucket that stores a lot of hashcode references. How does it know which bucket to look at at the first place then? Does it mean it kind of look at 1st the bucket number then the hashcode? Also, someone mentioned the equals(), which is part of the equals/hashcode contract (still not sure what this means), so a key object will use its equals() method to check against the key stored in the bucket at the hashcode reference if more than one storing is happening at that particular hashcode reference? – user2929613 Oct 28 '13 at 20:51
  • 2
    A bucket is not the same as an entry. A bucket is a logical collection of all entries sharing the same bucket# (which in the Java case is a function of the key's hashCode()). As stated in other answers, bucket overflow could be implemented several ways, not necessarily in a List. – MrBackend Oct 28 '13 at 20:57
  • Single bucket may be assigned with multiple hash codes ? Nope, Multiple hashcodes may be assigned to single bucket. – Sundar Rajan May 20 '19 at 01:55
  • @SundarRajan It seems you stated the exact same thing twice, but your second sentence claims it's the opposite of the first one. – Marko Topolnik May 20 '19 at 07:28
0

Your understandings are OK, but take into account that there are several implementations. The actual hashcode HashMap uses to store the value may not be 106079. Here's one implementation (java-6-openjdk):

public V put(K key, V value) {
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

    modCount++;
    addEntry(hash, key, value, i);
    return null;
}

Notice the hash method, which consists of the following:

/**
 * Applies a supplemental hash function to a given hashCode, which
 * defends against poor quality hash functions.  This is critical
 * because HashMap uses power-of-two length hash tables, that
 * otherwise encounter collisions for hashCodes that do not differ
 * in lower bits. Note: Null keys always map to hash 0, thus index 0.
 */
static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

So in this JVM's example, it does not use 106079 as a hash, since HashMap recreates a hash to 'harden' it.

I hope that helps

FedeCz
  • 533
  • 3
  • 4
  • Thanks. But then what is this concept of bucket? So you have a small amount of bucket that stores a lot of hashcode references. How does it know which bucket to look at at the first place then? Does it mean it kind of look at 1st the bucket number then the hashcode? Also, someone mentioned the equals(), which is part of the equals/hashcode contract (still not sure what this means), so a key object will use its equals() method to check against the key stored in the bucket at the hashcode reference if more than one storing is happening at that particular hashcode reference? – user2929613 Oct 28 '13 at 20:52
  • with Add and Get it is pretty much clear. what about Remove method? how does it implemented? –  May 27 '23 at 15:18