7

When using a HashMap, how much does the object type matter for element retrieval when it comes to speed? Say I use a loop to iterate through possible keys of a large hash map. What would be the most efficient key type I could use?

As of now I am using a String for the key object type due to simplicity for my sake. While coding, this question popped up in my head and struck my curiosity. I attempted searching this question online, but couldn't quite find the answer I was looking for. Thanks!

Dillon Burton
  • 363
  • 6
  • 16
  • 1
    I dont think oject type of key matters in speed in HashMap. Size matters not object type. – Achintya Jha Apr 23 '13 at 06:30
  • Really? When you think about it, it seems that it would matter. There has to be some comparing going on, checking to see if the requested key exists in the hash map. Does it possibly look for the memory location, or does it work in a different way? Now I am very curious to know. Although, I do believe you XD an explanation would be awesome though. – Dillon Burton Apr 23 '13 at 06:33

5 Answers5

5
  1. Key hashCode() and equals() should be fast

  2. hashCode() should be well-distributed to minimize hash collisions

Andrew Tobilko
  • 48,120
  • 14
  • 91
  • 142
Evgeniy Dorofeev
  • 133,369
  • 30
  • 199
  • 275
  • 1
    2: Note that the implementation of java.util.HashMap uses some bit-shuffling to make sure the keys are evenly distributed even if your hashcodes are not that good. You still have to make sure they are distinct though.. Also see this post: http://stackoverflow.com/questions/2414117/explanation-of-hashmaphashint-method – rethab Apr 23 '13 at 06:48
3

The hash map would ask your key for a hashCode(). If the time taken to generate a hash code is unreasonable, then insertion and retrieval times for such objects would be high. Take java.net.URL for example. It's hashcode method performs a DNS lookup. Such objects would not make good keys for a hash map.

There is no universal answer to which is the best key, since there is no best key. The best key for use in your hash map is one that you need for retrieval. Just make sure the key's hashCode() is quick and uses the int space appropriately.

Deepak Bala
  • 11,095
  • 2
  • 38
  • 49
  • 1
    I hate to sound like a four year old, but why is there no best key? What is going on down in the lower level where the object type wouldn't matter? – Dillon Burton Apr 23 '13 at 06:38
  • Efficiency is affected by hashCode and equals, as indicated in some of the answers here. But the point that there is no best key is that you likely have a reason for using a hash data structure. And if you already have an existing type that you need to use to access your data quickly, that's the type you'll use for your key :-) – Kirby Apr 23 '13 at 06:40
  • The object type does not matter as much as its `hashCode()` matters. The hash map takes the hashCode and applies a hash function to it. The hash function reduces the hashCode from the `int` space to an index between 0 -> N where N is the size of an array used to store the object. What matters at the end of the day is the `hashCode` impl. @Kirby has explained the point I was trying to make. – Deepak Bala Apr 23 '13 at 06:42
  • 2
    As all the answers imply, the key is not stored itself. The key's hashCode() method is invoked to retrieve an int value. This int value is then used internally as the key. If two key objects have the same hashCode() result, further steps are taken internally to remove the collision and disambiguate the key/value pairs. Of course, the hashCode() method for Integers does not have much computational overhead. Thus, arguably, Integers are 'best' for use as keys. However, your program's needs may override the use of Integers as keys. That's what the hashCode methods are for! – Mike Adler Apr 23 '13 at 06:42
  • @Mike Adler I understand now. The way you explain how the type matters when two objects have the same hash code makes a lot more sense. Thank you! – Dillon Burton Apr 23 '13 at 06:44
1

What matters is the implementation of the equals and hashCode methods. See the following: What issues should be considered when overriding equals and hashCode in Java?

As these functions are utilized in hash operations, their efficiency comes into play when you operate on your collection.

As a sidenote, keep in mind the point made in the reference link:

Make sure that the hashCode() of the key objects that you put into the collection never changes while the object is in the collection. The bulletproof way to ensure this is to make your keys immutable, which has also other benefits.

Community
  • 1
  • 1
Kirby
  • 3,649
  • 3
  • 26
  • 28
1

What matters in your case is the speed of the hashCode method of the elements and of the equals method. Using Integers is fine, as they don't require any special computation for the hash value. Strings are also ok, because the hash value is cached internally, although they perform slower with equals.

GOTO 0
  • 42,323
  • 22
  • 125
  • 158
0

Are you trying to retrieve values from the HashMap via the get method or by iterating over it? As for the get method, all the people above have answered this.

If you are iterating over a HashMap via the entrySet method, the type of the keys in your HashMap doesn't matter. Besides, having an entry of the entrySet at hand in each iteration, looking up values becomes useless. Also note that entrySet is in generally preferable to the values method and keySet method as they both internally use the entrySet iterator and return either the key or value of an entry.

rethab
  • 7,170
  • 29
  • 46