1

In the javadocs for Guava's ImmutableMap it says:

Performance notes: unlike HashMap, ImmutableMap is not optimized for element types that have slow Object.equals(java.lang.Object) or Object.hashCode() implementations. You can get better performance by having your element type cache its own hash codes, and by making use of the cached values to short-circuit a slow equals algorithm.

So my first question is how do I know if my element has a slow .equals or .hashCode implementation? In my specific instance, I'm using a java Enum as my key, so that has an efficient default implementation of .equals and .hashCode, right? (I assume the implementation of those for the values is irrelevant so long as you aren't accessing the map using the values' values.)

My second question is what does "having your element type cache its own hash codes" even mean! Googling around I couldn't seem to find an example of how you would do that. I assume maybe that means you end up with hashcodes within hashcodes? So I get into the hashcode bucket, and then the .equals methods uses a second set of hashcodes within that?

CorayThan
  • 17,174
  • 28
  • 113
  • 161

3 Answers3

4

I'm using a java Enum as my key, so that has an efficient default implementation of .equals and .hashCode, right?

Neither equals or hashcode are overriden so it could hardly be much faster (i.e. equals returns this == other).

My second question is what does "having your element type cache its own hash codes" even mean!

You could have something like the code below, to avoid multiple calculation - if you do that, you need to make sure that:

  • the hashcode is invariant (i.e. can't change throughout the life of an instance)
  • your hashcode method is thread safe (which can be done with the use of a local variable for example or more simply by making hash volatile).
class MyClass {
    private int hash;

    public int hashcode() {
        int hash = this.hash;
        if (hash == 0) {
            hash = calculateIt();
            this.hash = hash;
        }
        return hash;
    }
}
assylias
  • 321,522
  • 82
  • 660
  • 783
  • 2
    If the object is immutable and hashcode() / equals() is often called, you can even compute the hashcode during the object's construction and not worry about thread-safety anymore. – Frank Pavageau Mar 16 '13 at 22:50
  • @FrankPavageau Absolutely - that is much easier and safer indeed. – assylias Mar 16 '13 at 23:14
2

If your key is an Enum then this is not an issue for you.

What they are talking about with caching hash codes would be something like:

private volatile int hashCode = -1;

@Override
public int hashCode() {
  if ( hashCode == -1 ) {
    hashCode = longCalculation();
  }
  return hashCode;
}
OldCurmudgeon
  • 64,482
  • 16
  • 119
  • 213
  • 1
    Of course, you have to be careful that -1 isn't a valid hash value. `java.lang.String` has this problem: some strings will recalculate their hashes every time. – Antimony Mar 16 '13 at 21:21
  • 1
    @assylias - You are right - to an extent. Actually it **is** safe but it **is** possible to call the `longCalculation` multiple times if multiple threads request the `hashCode` at the same time. If that is a critical issue then I would certainly put more effort into this method in a multi-threaded environment. It does, however, clearly demonstrate the point I am making. – OldCurmudgeon Mar 16 '13 at 21:26
  • So what they mean is you save the hashcode as an instance variable so you don't need to repeatedly calculate it? Basically, you sacrifice some memory (extra instance variable) to enhance the performance by caching the value. – CorayThan Mar 16 '13 at 21:29
  • @CorayThan - Yes! That is what they mean by caching the value. – OldCurmudgeon Mar 16 '13 at 21:39
  • @OldCurmudgeon Actually technically it could erroneously return -1 because of instructions reordering. This is why String#hashcode uses a local variable for example. – assylias Mar 16 '13 at 23:14
  • @assylias - FTFY (in a much more natural way I suspect) - I still fail to see how thread safety is a factor here. IMHO If we tried to make all of our code thread safe we would soon bring productivity to a halt. – OldCurmudgeon Mar 17 '13 at 00:00
  • 1
    @Antimony relevant: http://stackoverflow.com/questions/2310498/why-doesnt-strings-hashcode-cache-0/ – Kevin Bourrillion Mar 17 '13 at 03:03
  • @assylias Actually, even if `hashCode` isn't volatile, it will still never erroneously return -1. Instructions in the same thread are guaranteed to executed as if in program order. At worst, you'll have longCalculation called multiple times, but it will never actually return -1. – Antimony Mar 17 '13 at 13:32
  • @Antimony Not when using a shared variable with improperly synchronized accesses. See for example: http://jeremymanson.blogspot.co.uk/2008/12/benign-data-races-in-java.html (third code example). – assylias Mar 17 '13 at 18:13
0

I know this is very old question, but for those who got here like me - you really should be using an EnumpMap, if your key is an enum.

Jan Mares
  • 795
  • 10
  • 22