Why is an extra var2 byte array used in the hashCode method of the StringLatin1 utility class?

Question

Current code is:

public static int hashCode(byte[] value) {
    int h = 0;
    byte[] var2 = value;
    int var3 = value.length;

    for(int var4 = 0; var4 < var3; ++var4) {
        byte v = var2[var4];
        h = 31 * h + (v & 255);
    }

    return h;
}

Possible code is:

public static int hashCode(byte[] value) {
    int h = 0;
    int var2 = value.length;

    for(int var3 = 0; var3 < var2; ++var3) {
        byte v = value[var3];
        h = 31 * h + (v & 255);
    }

    return h;
}

The java.lang package contains a utility class called StringLatin1. Its hashCode method is called from the String class's hashCode method when the string's value is stored as Latin-1.

PS: I use Java 11.

score 2 · Accepted Answer · answered Jan 26 '20 at 09:50

The code you posted is not the real source code; it is decompiled code. Decompiler output can vary from one decompiler to another, so you cannot rely on it.

Arvind Kumar Avinash

Where can I get the real code? Can you please give a link to an open source repository on GitHub? – Seydazimov Nurbol Jan 26 '20 at 09:53
Please check http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/java. You can also check https://stackoverflow.com/questions/261015/where-can-i-see-the-source-code-of-the-sun-jdk – Arvind Kumar Avinash Jan 26 '20 at 09:55
@SeydazimovNurbol OpenJDK projects use Mercurial: https://hg.openjdk.java.net/. Here's `StringLatin1` for JDK 11: https://hg.openjdk.java.net/jdk/jdk11/file/1ddf9a99e4ad/src/java.base/share/classes/java/lang/StringLatin1.java#l193. Also, the JDK typically comes with a `src.zip` file which contains many of the _Java_ source files (but not any of the native source files). – Slaw Jan 26 '20 at 09:57
@slaw How do people contribute to Java? Is there an open source repo on GitHub? – Seydazimov Nurbol Jan 26 '20 at 10:18
@SeydazimovNurbol Read-only mirror on GitHub: https://github.com/openjdk/jdk. Information I could find about contributing: https://openjdk.java.net/contribute/ – Slaw Jan 26 '20 at 10:30
Why don’t you check your JDK folder for the `lib/src.zip` first? – Holger Jan 27 '20 at 17:08

score 1 · Answer 2 · answered Jan 27 '20 at 17:31

This is the standard pattern of the for-each loop.

When you write

for(Type variable: expression) {
    // body
}

the expression is evaluated exactly once, at the beginning of the loop, and the resulting collection or array reference is remembered throughout the loop. This also implies that if the expression is a variable and that variable is assigned within the loop body, the assignment has no effect on the ongoing loop.
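A minimal sketch of that behavior, assuming nothing beyond the standard library (the class name ForEachSnapshot and the method visit are made up for this illustration): the array variable is reassigned inside the loop body, yet the loop keeps walking the array it captured at the start.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
final class ForEachSnapshot {
    // Returns the elements the loop actually visits.
    static List<Integer> visit() {
        int[] data = {1, 2, 3};
        List<Integer> seen = new ArrayList<>();
        for (int x : data) {
            // Reassigning 'data' here does not change the ongoing loop:
            // the loop iterates over the array reference captured at its start.
            data = new int[] {10, 20, 30, 40};
            seen.add(x);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(visit()); // prints [1, 2, 3]
    }
}
```

The hidden copy of the array reference is what makes this well defined; it corresponds to var2 in the decompiled code from the question.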

The relevant part of the specification says:

…
Otherwise, the Expression necessarily has an array type, T[].

Let L1 ... Lm be the (possibly empty) sequence of labels immediately preceding the enhanced for statement.

The enhanced for statement is equivalent to a basic for statement of the form:
T[] #a = Expression;
L1: L2: ... Lm:
for (int #i = 0; #i < #a.length; #i++) {
    {VariableModifier} TargetType Identifier = #a[#i];
    Statement
}
#a and #i are automatically generated identifiers that are distinct from any other identifiers (automatically generated or otherwise) that are in scope at the point where the enhanced for statement occurs.

TargetType is the declared type of the local variable in the header of the enhanced for statement.
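As a rough illustration of the quoted form, the following sketch writes the same loop twice: once as an enhanced for statement and once expanded by hand. The names a and i stand in for the generated identifiers #a and #i, and the class name EnhancedForExpansion is invented for this example. The loop variable is declared as long over an int[] to show where TargetType appears in the expansion.

```java
// Hypothetical demo class, not part of the JDK.
final class EnhancedForExpansion {
    // Enhanced for statement; the declared variable type (long) is the TargetType.
    static long sumForEach(int[] values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return sum;
    }

    // Hand-written basic for statement following the form quoted above.
    static long sumExpanded(int[] values) {
        long sum = 0;
        int[] a = values;                     // T[] #a = Expression;
        for (int i = 0; i < a.length; i++) {  // #i < #a.length
            long v = a[i];                    // TargetType Identifier = #a[#i];
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] sample = {1, 2, 3, 4};
        System.out.println(sumForEach(sample));  // prints 10
        System.out.println(sumExpanded(sample)); // prints 10
    }
}
```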

If you compare the decompiled version

public static int hashCode(byte[] value) {
    int h = 0;
    byte[] var2 = value;
    int var3 = value.length;

    for(int var4 = 0; var4 < var3; ++var4) {
        byte v = var2[var4];
        h = 31 * h + (v & 255);
    }

    return h;
}

with the actual source code

public static int hashCode(byte[] value) {
    int h = 0;
    for (byte v : value) {
        h = 31 * h + (v & 0xff);
    }
    return h;
}

you will recognize the translation. var2, var3, and var4 are all synthetic variables. Things to note:

In principle, a compiler could analyze the scenario and recognize that value is a local variable which is not assigned in the loop body, so no additional variable would be needed here. But the savings over the standard translation strategy have not been considered worth implementing the additional logic.
Likewise, it’s the compiler’s decision whether to remember the invariant array length in another local variable. As shown above, the specification does not mandate it.
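To check that both shapes compute the same value, here is a small sketch. The class Latin1HashDemo and its method names are hypothetical, and the expanded variant only mirrors the decompiled layout with readable local names. It runs the for-each form and the expanded form over the Latin-1 bytes of a string and compares both with String.hashCode().

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
final class Latin1HashDemo {
    // Same shape as the actual source: a for-each loop over the array.
    static int hashForEach(byte[] value) {
        int h = 0;
        for (byte v : value) {
            h = 31 * h + (v & 0xff);
        }
        return h;
    }

    // Same shape as the decompiled output: the array reference and its
    // length are copied into extra locals before the loop starts.
    static int hashExpanded(byte[] value) {
        int h = 0;
        byte[] arr = value;
        int len = arr.length;
        for (int i = 0; i < len; ++i) {
            byte v = arr[i];
            h = 31 * h + (v & 0xff);
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo"; // every character fits in Latin-1
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(hashForEach(bytes) == hashExpanded(bytes)); // prints true
        System.out.println(hashForEach(bytes) == s.hashCode());        // prints true
    }
}
```

The mask with 0xff matters for bytes above 0x7f such as the é here: without it, the negative byte value would be sign-extended and the result would no longer match the hash computed over char values.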

You could say that it is a weakness of the decompiler not to recognize the for-each loop and translate it back. However, there is generally an ambiguity when mapping compiled code back to source code constructs, because many different source variants can produce the same compiled code.
