19

I have read a lot of posts in the past one hour, but I am still not very clear with the concept of using immutable objects as keys in a Hashmap. I have a hashmap that has its key as a String. The value in the hashmap is MyStore, where MyStore represents information regarding stores I own. The String represents address. In my code, the logic I have is, I first look in the map for that key, if present --> get its value, if its not present put it in hashmap. My manager just told me the key will change in the future, that is address of my stores will change in the future. He said in that case, my logic of first checking if the key exists won't work. I don't understand what he means here. I want to understand the below points very clearly -

  1. Difference between mutable and immutable keys for a hashmap.
  2. What happens if you use a immutable key that can change? - I know this doesn't make sense, but I want to clearly understand what my manager is talking about here.
  3. Some posts talk about Strings if used as keys in a hashmap cache their hashcode -What does this mean?
  4. If lets say I used mutable objects as keys in my hashmap that implemented hashcode and equals, then will it work? I am assuming it will because if the key changes, the contains method will look if the key is present. If it is not present, it will put the entry so you can get it in the future.

I don't mean to create a duplicate post if this has been discussed before. If I missed reading the post that has answers to all my questions, please point me to it. If not, please explain in layman terms the above questions I have so it is useful in the future for other readers :). Feel free to edit my post's subject so in future if anyone has a similar question, they land here directly :)

rickygrimes
  • 2,637
  • 9
  • 46
  • 69
  • 1
    This should answer your question http://stackoverflow.com/questions/214714/mutable-vs-immutable-objects – Spencer Aug 03 '13 at 06:24
  • Ever changing hash-code breaks HashMap's contract. In such a case you may want to [map by object reference](http://docs.oracle.com/javase/1.5.0/docs/api/java/util/IdentityHashMap.html). Or convert the map to object-property relation. – S.D. Aug 03 '13 at 11:15

5 Answers5

29

First : how does a HashMap work?

Basically it has an array and when you put a key-value pair in the map, it is stored in one of the positions in the array. The position in the array is chosen based on the result of the key's hashCode() passed to a hashing method. Why is that? Well, if you request the value for a certain key, the index in the array to find the key and its associated value can simply be recalculated to find the index in the array again. (Some more logic is needed to deal with keys that map to the same index, but I'm just trying to get you to understand the basic mechanism) Then equals() is used to verify if the key at the calculated index is indeed the requested key.

  1. From this it should be a bit more clear why immutable keys are better than mutable keys. An immutable key will always keep the same hashCode() value, and the hashing function will find the correct bucket ( = index in the hashMap's array) again.

    That doesn't mean mutable keys cannot work. A mutable key will work if mutations on the key do not affect the hash code or if the keys are simply not mutated as long as the hashMap is used.

  2. How can an immutable key change? Well, the key itself may not be able to change, but the key-value mapping can change in business logic. If you create a map, using the address as a key, you rely on the fact that a store's address will not change. If a store's address changes, you'll not find it in the Map using its new address as a key. Your manager has a valid point.

  3. the speed of finding a key in a Map highly depends on the speed of calculating the hashCode. For a String this calculation loops over all the characters in the String. If you use long Strings as keys and have a lot of Map access this may lead to a performance bottle neck. Java's String implementation therefore caches the hash value, so it will be calculated only once. However you'll only avoid calculating the hash code if you use the same String instance again (new instances will not have the cached value). You may intern() the keys you use, but consider this only if it can be shown that there really is a performance bottle neck, as String interning does come with its own overhead.

  4. as explained in 1 : mutable keys can work if their hash code is not affected by mutations. e.g. using a Customer as key, where the hashCode() is based only on the customer's name, then a Customer implementation that only does not allow the name to change, yet allows other values to change, is a reliable key.

Community
  • 1
  • 1
bowmore
  • 10,842
  • 1
  • 35
  • 43
  • Excellent. Thanks bowmore. – rickygrimes Aug 03 '13 at 07:03
  • String's hashcode is already cached... – assylias Aug 03 '13 at 07:33
  • I'm imagining that if you change a mutable key then the HashMap should calculate it's new `hashCode()` and relocate it. I don't know what is the implementation of the HashMap in Java, but I think it would be reasonable to check if the key has been changed and if yes -> calculate the new `hashCode()` -> relocate the element and release the previous position. – Anton Belev Aug 03 '13 at 08:09
  • 6
    @AntonBelev How would a Map implementation (especially an all purpose one) be able to know when a key's hash code has changed? A Map could do a check on all contained keys prior to every get, but that would simply defeat the purpose of a Map : quickly getting the value associated with a key. – bowmore Aug 03 '13 at 08:18
2
  1. There might be a problem if you modify you mutable object used as key. map.containsKey(modifiedKey) might return false even if key is there, You'll have to iterate over keys to find it. So try to use immutable or don't modify mutable while it is a key.

  2. Immutable object never changes. There are methods that look like they're changing the object but instead a new copy is created. Example:

    String a = "A";

    String b = a.substring(0); // substring created a copy of "A" with a not being modified at all.

    a = a + b; // a+b create a new String "AA" with no modification of previous ones.

  3. This might help caching-hashes-in-java-collections also this is great why-are-immutable-objects-in-hashmaps-so-effective

  4. String has already implementation of equals and hashcode, no need to invent another class to use instead of it unless you're absolutely sure you need it.

    As mention in point 1. you can do that but you'll have to be careful and not modify your mutable objects. That's not a very good practice though.

Community
  • 1
  • 1
Tala
  • 8,888
  • 5
  • 34
  • 38
  • Thank you so much for the link - http://stackoverflow.com/questions/10342859/why-are-immutable-objects-in-hashmaps-so-effective. It answered all the questions I had. – rickygrimes Aug 03 '13 at 06:44
  • Sure I could. But I am honestly still not very clear with point #2. Also, please modify your answer to my point #4. I was referring to immutable objects. – rickygrimes Aug 03 '13 at 06:49
1
  1. Immutable keys cannot change. Consequently, the hashcode which is computed at the time of insertion cannot change. So when you try to get an element from the map, the hashcode of the object to get is computed against known hashcodes. Had your key changed from outside(it was mutable), the new key's hashcode would be different from the one which you inserted.

  2. Let's see an example. for(2 and 4)

    public class RandomPair {
        int p;
        int q;
    
        public RandomPair(int p, int q) {
            this.p = p;
            this.q = q;
        }
        @Override
        public int hashCode() {
            return 31 * p + q;
        }
    
        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof RandomPair)) {
                return false;
            }
            if (obj == this) {
               return true;
            }
    
            RandomPair other = (RandomPair) obj;
            if (p != other.p)
                return false;
            if (q != other.q)
                return false;
            return true;
        }
    
        public static void main(String[] args) {
            RandomPair pair = new RandomPair(10, 10);
            Map<RandomPair, Integer> map = new HashMap<RandomPair, Integer>();
    
            map.put(pair, 1);
            System.out.println(map.get(pair)); //returns 1
    
            //someone somewhere just changed the value of pair
            pair.p = 20;
            //the object was the same, someone somewhere just changed value of pair and now you can't 
            //find it in the map
            System.out.println(map.get(pair));
    
            //had you made p and q final, this sort of modification wouldn't be possible
           //Strings are immutable and thus prevent this modification
        }
    }
    
  3. Since strings are immutable, the hashcode value once computed can be reused again. The hashcode is lazily computed. ie on the first call to hashcode and then the value of hashcode is cached.

bsd
  • 2,707
  • 1
  • 17
  • 24
  • Thanks bsd. Can you also add a few lines about what my manager is talking about? We are receiving our data from a customer as a JSON string. So they will be changing the address for sure. I want to understand how that can affect the logic I have? – rickygrimes Aug 03 '13 at 06:59
  • Do not keep address a part of computing `hashcode` and `equals`. Fields which are constants, may be their unique 10 digit identification number(may be SSN) should be part of `hashcode`. Names may change, addresses may change. If you can can find out what changed in the `Person` object, you can remove that item from the map and re-insert with the new fresh values. – bsd Aug 03 '13 at 07:05
0

In general, keys in hashmaps should be immutable .

See this

Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map.

The hash of your key is calculated once during insert, the hashmap stores the it and it will not get automatically updated once your key is modified. That's why there's an a assumption that keys will be immutable.

Your options are: 1. Don't use mutable objects as keys. Try to find another key, or use an immutable part of your former key object 2. Don't change your mutable objects while they are used as keys

Vadim
  • 2,847
  • 15
  • 18
0
  1. A mutable key or object means that you can modify the object [by modify i mean you can change values represented by object]. This will impact its storage in HashMap if the logic written in equals and hashcode uses these modifiable value.

  2. Immutability ideally means that object once initialized cannot be changed after that. But if we talk specifically in terms of HashMap then all the variables which are used inside equals and hashcode method, if they can be modified then that object should not be used as key else it can be used as key [but still not recommended].

  3. Its not just about String, any about would cache its hashcode. Hashcode is generated again and again for almost all objects [There is a reason why i say almost as in some cases it can change]. Hashcode is cached in Object header.

  4. If you want to use mutable object as key then you should go for IdentityHashMap. Just read about them, they can be useful in such cases.

Lokesh
  • 7,810
  • 6
  • 48
  • 78