7

Just curious, in String's hashCode implementation what is the reason behind extra references creation in a hashCode implementation (v 1.8.0_65):

public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

Taking into consideration that value is final and created in constructor only (i.e. threadsafe) why do we need variable val[] reference here?

I.e. will this work:

public int hashCode() {
    if (hash == 0 && value.length > 0) {
        int h = 0;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + value[i];
        }
        hash = h;
    }
    return hash;
}

?

In addition to copying values from heap to stack to speed things up it is also about race conditions described in comments by @zapl. Which was not obvious to me before his comment.

  • This is not about avoiding getopt. This is http://jeremymanson.blogspot.de/2008/12/benign-data-races-in-java.html - a data race because `hash` is neither `volatile` nor being synchronized. That's why they don't `return hash` (there is no guarantee about the value you're reading) – zapl Oct 30 '15 at 12:28
  • I disagree with this statement. hash is int (not long), i.e. all ops are atomic and in the worst case you'll just calculate it twice which might happen in both cases. – Konstantin Kulagin Oct 30 '15 at 14:51
  • It's not about atomically setting the value but about operation (re)ordering. From linked article about your version: *What I've done here is to add an additional read: the second read of hash, before the return. As odd as it sounds, and as unlikely as it is to happen, the first read can return the correctly computed hash value, and the second read can return 0! This is allowed under the memory model because the model allows extensive reordering of operations. The second read can actually be moved, in your code, so that your processor does it before the first!* – zapl Oct 30 '15 at 15:53
  • Jeremy Manson is one of the authors of the JMM. I guess he knows what he's blogging there :) TBH, I don't understand the condition in which it's legal to reorder that read so that it happens when the value is still 0. – zapl Oct 30 '15 at 19:37
  • Still, I tend to disagree, we don't have any conditions inside our (if) block and taking into consideration that even after reordering it must work correctly in one thread I would say it should work in the same way. We can use chat for further discussion. And in any case thank you for this comment - it forced me to recall a lot of stuff :) – Konstantin Kulagin Oct 30 '15 at 19:39
  • Sorry about a little mess with comments. But even if we do not consider `hash` variable why do we copy `value` which is final? :) – Konstantin Kulagin Oct 30 '15 at 19:55
  • 1
    I guess that's indeed avoiding the repeated load of the field. Unlikely to be necessary with modern JVMs but that method is also using code that is quite old. http://docs.oracle.com/cd/E13150_01/jrockit_jvm/jrockit/geninfo/diagnos/underst_jit.html#wp1077986 for example shows that jrockit (which has merged with hotspot since java 7 or so) can certainly cache field references locally for you (and that's just one of the simpler things it does). – zapl Oct 30 '15 at 20:13
  • Actually I would like to mark your answer as a correct one as well which I cannot do with comment. Thanks again for the help! – Konstantin Kulagin Nov 05 '15 at 14:16

1 Answers1

1

It seems like the intent is to put hash and the value handle explicitly on the stack

ControlAltDel
  • 33,923
  • 10
  • 53
  • 80