I had a look at the source code of the String.hashCode() method. This is the implementation from 6-b14; it has since been changed.

public int hashCode() {
    int h = hash;
    if (h == 0) {
        int off = offset;
        char val[] = value;
        int len = count;

        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}

My question is about this line:

int len = count;

Here count is a field (an instance variable) holding the number of characters in the String.

Why does the loop condition use the local variable len instead of the field itself? The field is only read, never modified. Is it simply good practice to copy a field into a local variable before reading from or writing to it? If so, why do it for reads as well?

Steve Benett

3 Answers

2

Access to a local variable is faster than access to an instance variable. The newer Java 8 code (see Anubian's answer) takes this into account as well. That is why it uses a local variable h for the hash calculation instead of accessing the instance variable this.hash directly, and why it creates the local reference char val[] = value;. With that in mind, though, I don't know why the for loop condition is i < value.length; rather than i < val.length; or, even better, a cached length: z = val.length; i < z;.
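The variant proposed in this answer, with the array length cached in a local z, could look roughly like the sketch below. LoopFormsSketch and hashWithCachedLength are hypothetical names, and the class only mimics String's private value and hash fields; it is not the JDK code.

```java
// Sketch only: a stand-in for String's internals, not the actual JDK source.
final class LoopFormsSketch {
    private final char[] value;   // plays the role of String's internal char array
    private int hash;             // cached hash; 0 means "not computed yet"

    LoopFormsSketch(String s) {
        this.value = s.toCharArray();
    }

    // Copies the array reference and its length into locals once,
    // so the loop itself touches no instance field.
    int hashWithCachedLength() {
        int h = hash;
        if (h == 0) {
            char[] val = value;
            for (int i = 0, z = val.length; i < z; i++) {
                h = 31 * h + val[i];
            }
            hash = h;
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "hello";
        // Same polynomial as String.hashCode(), so the results agree.
        System.out.println(new LoopFormsSketch(s).hashWithCachedLength() == s.hashCode()); // prints "true"
    }
}
```

Whether this form is measurably faster than i < value.length once the JIT has compiled the method is not something the sketch demonstrates.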

2

Poking around in the String class, I found a comment on this odd-looking assignment to a local variable in the String.trim() method. It reads "avoid getfield opcode".

public String trim() {
    int len = value.length;
    int st = 0;
    char[] val = value;    /* avoid getfield opcode */

    while ((st < len) && (val[st] <= ' ')) {
        st++;
    }
    while ((st < len) && (val[len - 1] <= ' ')) {
        len--;
    }
    return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
}

So the whole thing seems to be about performance, as Frank Olschewski pointed out.

In the Java bytecode an instance variable is actually referenced by object and name (using the GETFIELD instruction). Without optimization, the VM must do more work to access the variable.

A potential performance hit of the code, then, is that it uses the relatively expensive GETFIELD instruction on each pass through the loop. The local assignment in the method removes the need for a GETFIELD every time through the loop.

The JIT optimizer might optimize the loop, but it also might not, so the developers probably took the safe path and enforced it manually.
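To see the difference at the bytecode level, a minimal sketch like the following can be compiled and disassembled. GetfieldSketch, sumViaField and sumViaLocal are made-up names for illustration; only the pattern is taken from the String source.

```java
// Sketch only: two loops that compute the same sum but differ in how
// often they read the instance field.
final class GetfieldSketch {
    private final char[] value;

    GetfieldSketch(String s) {
        this.value = s.toCharArray();
    }

    // Reads this.value in the loop condition and in the loop body,
    // so the field is accessed on every iteration.
    int sumViaField() {
        int sum = 0;
        for (int i = 0; i < value.length; i++) {
            sum += value[i];
        }
        return sum;
    }

    // Reads this.value once, then works only with locals.
    int sumViaLocal() {
        char[] val = value;
        int len = val.length;
        int sum = 0;
        for (int i = 0; i < len; i++) {
            sum += val[i];
        }
        return sum;
    }
}
```

Running javap -c -p GetfieldSketch on the compiled class lists the instructions of both methods. In the javac output, the first method should show getfield inside the loop, while the second should show it only once before the loop. What the JIT makes of either version at run time is a separate matter that the disassembly does not show.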

There's a separate question on Avoiding getfield opcode which has the details.

MicSim
1

If count can be modified, then you want a local variable. If you have multithreading going on, then you want a local variable. It's safest to create a local variable. However, it's not strictly necessary.
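Why a local copy matters when the field can change is easiest to see with a mutable field. The following is a single-threaded sketch in which the loop body itself changes the field; SnapshotSketch and its methods are hypothetical and have nothing to do with String.

```java
import java.util.function.IntConsumer;

// Sketch only: a class with a mutable count field, unlike String.
final class SnapshotSketch {
    private int count = 3;

    void setCount(int newCount) {
        count = newCount;
    }

    // Re-reads the field on every pass, so a change made while the loop
    // is running alters the number of iterations.
    int iterationsReadingField(IntConsumer body) {
        int iterations = 0;
        for (int i = 0; i < count; i++) {
            body.accept(i);
            iterations++;
        }
        return iterations;
    }

    // Takes a snapshot first; later changes to the field are ignored.
    int iterationsReadingLocal(IntConsumer body) {
        int len = count;
        int iterations = 0;
        for (int i = 0; i < len; i++) {
            body.accept(i);
            iterations++;
        }
        return iterations;
    }
}
```

With a body that calls setCount(1), the field-reading loop stops after one iteration, while the snapshot loop runs all three. With several threads the local copy likewise keeps the value stable inside the method, although visibility of writes between threads is a separate concern that this sketch does not address.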

In this case, it's overkill, since Strings are immutable anyway. The value of count can't even change.

It's pretty much useless, which is why in Java 8 it looks like this:

public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

They don't even have count anymore; they're using value.length, where value is a final char array.

They are still doing char val[] = value, but that only copies a reference and isn't strictly necessary.

There might be some subtle micro-optimization in using a local variable, or it might have been done for readability, but it's not necessary (and in my opinion less readable).

Anubian Noob