8

Consider the following function:

int foo(int[] indices) {
  int[] lookup = new int[256];
  fill(lookup); // populate values, not shown

  int sum = 0;
  for (int i : indices) {
    sum += lookup[i & 0xFF]; // array access
  }

  return sum;
}

Can modern HotSpot eliminate the bounds check on the lookup[i & 0xFF] access? This access cannot be out of bounds since i & 0xFF is in the range 0-255 and the array has 256 elements.

Jorn Vernee
  • 31,735
  • 4
  • 76
  • 93
BeeOnRope
  • 60,350
  • 16
  • 207
  • 386
  • Isn't that rather something that the compiler should optimize? I think HotSpot is more about optimization in context of "a function is called often, so let's make the call faster". – akuzminykh Apr 10 '21 at 22:02
  • @akuzminykh - in Java the compiler does very few optimizations. Almost all of the standard compiler optimizations are applied at runtime by the JIT. That doesn't really answer whether it "should" be this way, but just how it is in practice. In this case, I don't think the compiler _can_ optimize it: the bounds check is implied in the array access bytecode (unlike compile-to-native compilers where the bounds check would use some explicit instructions before the access): so absent some kind of "unsafe array access" bytecode, only the JVM can optimize this. – BeeOnRope Apr 10 '21 at 22:10

1 Answers1

8

Yes, this is a relatively simple optimization, which HotSpot definitely can do. JIT compiler deduces possible ranges of expressions, and uses this information to eliminate redundant checks.

We can verify this by printing the assembly code: -XX:CompileCommand=print,Test::foo

...
0x0000020285b5e230: mov     r10d,dword ptr [rcx+r8*4+10h]  # load 'i' from indices array
0x0000020285b5e235: and     r10d,0ffh                      # i = i & 0xff
0x0000020285b5e23c: mov     r11,qword ptr [rsp+8h]         # load 'lookup' into r11
0x0000020285b5e241: add     eax,dword ptr [r11+r10*4+10h]  # eax += r11[i]

There are no comparison instructions between loading i and lookup[i & 0xff].

apangin
  • 92,924
  • 10
  • 193
  • 247
  • 2
    Surprising to see the load into r11 there, it is invariant so shouldn't it be moved out of the loop? – amonakov Apr 11 '21 at 06:45
  • 1
    Looking at this yesterday with IGV and wondering why I don't see the range check after parsing, but it turns out the range check is never even emitted: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/parse2.cpp#L140-L144 Another way to find this is to use `-XX:+LogComilation` and look for `` for the `iaload` instruction (can match up bci with javap output). So yeah, really a very simple optimization :) – Jorn Vernee Apr 11 '21 at 11:57
  • @amonakov Indeed, there is a redundant load on JDK 8. When I ran the same code on JDK 11, it went away (so that the loop body consisted of just 3 instructions). – apangin Apr 11 '21 at 12:45