28

The "Performance Tips" section in the Android documentation has a pretty bold claim:

one() is faster. It pulls everything out into local variables, avoiding the lookups. Only the array length offers a performance benefit.

where it refers to this code snippet:

int len = localArray.length;

for (int i = 0; i < len; ++i) {
    sum += localArray[i].mSplat;
}

This surprised me a lot because localArray.length is just accessing an integer and if you'd use an intermediate variable, you'd have to do that exact same step again. Are we really saying that an intermediate variable that only has to go to x instead of y.x is faster?

I took a look over at this question which is about the same idea but uses an arraylist and its subsequent .size() method instead. Here the concensus seemed to be that there would be no difference since that method call is probably just going to be inlined to an integer access anyway (which is exactly the scenario we have here).

So I took to the bytecode to see if that could tell me anything.

Given the following source code:

public void MethodOne() {
    int[] arr = new int[5];
    for (int i = 0; i < arr.length; i++) { }
}

public void MethodTwo() {
    int[] arr = new int[5];
    int len = arr.length;
    for (int i = 0; i < len; i++) { }
}

I get the following bytecode:

public void MethodOne();
    Code:
        0: iconst_5
        1: newarray       int
        3: astore_1
        4: iconst_0
        5: istore_2
        6: iload_2
        7: aload_1
        8: arraylength
        9: if_icmpge     18
        12: iinc          2, 1
        15: goto          6
        18: return

public void MethodTwo();
    Code:
        0: iconst_5
        1: newarray       int
        3: astore_1
        4: aload_1
        5: arraylength
        6: istore_2
        7: iconst_0
        8: istore_3
        9: iload_3
        10: iload_2
        11: if_icmpge     20
        14: iinc          3, 1
        17: goto          9
        20: return

They differ in the following instructions:

Method one

6: iload_2
7: aload_1
8: arraylength
9: if_icmpge     18
12: iinc          2, 1
15: goto          6
18: return

Method two

9: iload_3
10: iload_2
11: if_icmpge     20
14: iinc          3, 1
17: goto          9
20: return

Now, I'm not 100% sure how I have to interpret 8: arraylength but I think that just indicates the field you're accessing. The first method loads the index counter and the array and accesses the arraylength field while the second method loads the index counter and the intermediate variable.

I benchmarked the two methods as well with JMH (10 warmups, 10 iterations, 5 forks) which gives me the following benchmarking result:

c.m.m.Start.MethodOne    thrpt        50  3447184.351    19973.900   ops/ms
c.m.m.Start.MethodTwo    thrpt        50  3435112.281    32639.755   ops/ms

which tells me that the difference is negligible to inexistant.


On what is the Android documentation's claim that using an intermediate variable in a loop condition, based?

Community
  • 1
  • 1
Jeroen Vannevel
  • 43,651
  • 22
  • 107
  • 170
  • Might be because n is fixed, while arrayName.length() will be evaluated each iteration. Not entirely sure, though. – Stultuske Aug 14 '15 at 13:26
  • Java holds arrays as well as string lengths as inside variable - not being evaluated at every call (can't find any reference now - some1 please confirm or denial) – itwasntme Aug 14 '15 at 13:28
  • 3
    Perhaps the advice applies to an older JIT? – harold Aug 14 '15 at 13:30
  • 4
    `arraylength` is *not* the name of a field. It's an [actual JVM instruction](https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-6.html#jvms-6.5.arraylength), which performs a pop, a dereference and a push. – RealSkeptic Aug 14 '15 at 13:53
  • Bytecode is different, but this is not what (modern) JVM executes anyway. It normally compiles bytecode into native instructions internally. I believe array length access should be moved out of the loop there, as it is guaranteed to stay constant throughout the loop. –  Aug 14 '15 at 14:33
  • I am not sure your byte code mimics the issue in the quoted document. First, their `one` improves over their `two` by accessing a local array over a *field*. Second, they actually have access to the array elements, meaning there is going to be bounds-checking within the loop. – RealSkeptic Aug 14 '15 at 15:20
  • 1
    I remember doing something similar and that the second version was actually slower than the first one. The reason was probably that the first one didn't perform any bound checks, except for the first and last element. I don't recall the version of java I was running – George Aug 14 '15 at 15:23

2 Answers2

13

You misunderstood the documentation. They are not referring to what you have described (although I don't blame you, they should've put more effort into those docs :)).

It pulls everything out into local variables, avoiding the lookups.

By avoiding the lookups they refer to field vs local variable access cost. Accessing the field (mArray in the example in the docs) requires loading this first, then loading the field based on the fixed offset from this.

After a while, JIT will probably figure out what's going on and optimize the field access as well (if the field is not volatile or some other form of synchronization happens in the loop) and rewrite the code so that all of the variables participating in the loop are accessed/changed in the CPU registers and caches until the end of the loop.

Generally, it could be potentially much more work for JIT to figure out whether it is safe to optimize the access to the length of the array referenced from a field compared to the reference stored in a local variable. Let's say we have the following loop:

for (int i = 0; i < array.length; ++i) {
    process(array[i]);
}

If array is a field and process invokes thousands of lines of complex code, then JIT may find it difficult to check whether the array field is changed somewhere in the loop to reference some other array which has a different length.

Obviously, it is much easier to check whether the local variable is changed in this case (three lines of code).

Community
  • 1
  • 1
Dragan Bozanovic
  • 23,102
  • 5
  • 43
  • 110
  • Well, they specifically say that the JIT can't do this for the field access. But I agree that this is a misunderstanding of the document. – RealSkeptic Aug 14 '15 at 15:25
  • Also, JIT wasn't always there. Correct me if I'm wrong, but it appeared since 2.3. – Dmitry Zaytsev Aug 14 '15 at 15:46
  • 1
    @RealSkeptic True, thanks for the remark. But generally, there are no obstacles for JIT to do it; maybe it's the current implementation for Android. Java memory model allows it. [Here (section 2.3)](http://docs.oracle.com/cd/E15289_01/doc.40/e15058/underst_jit.htm) is an example of such an optimization for JRockit JVM. – Dragan Bozanovic Aug 14 '15 at 15:49
  • 1
    @DmitryZaitsev Then that could explain what they meant in the docs - field access in loops can be costly if the current Android JIT can't optimize it (or if there is no JIT at all, of course :)). – Dragan Bozanovic Aug 14 '15 at 15:56
  • It's still unclear to me how this relates to the statement "Only the array length offers a performance benefit.". It says it pulls everything out into local variables. When we look at the code, we see that "everything" refers to the field array and the array length. Then they proceed by saying only the array length makes a difference. However in your answer you're focusing on the other local variable? – Jeroen Vannevel Aug 14 '15 at 16:00
  • Yes, I was about to ask the same thing. If the problem is field access, then every index access inside the loop should offer a performance benefit when the array variable is local, not just the length lookup. – RealSkeptic Aug 14 '15 at 16:09
  • True, as I said those docs seem to be written in some kind of a hurry. Just look at the sentence: _"Only the array length offers a performance benefit."_ What does that mean? The fact that the array has a length offers a performance benefit? :) Maybe they mean that using the array length local variable _only_ (without a local array reference) would _also_ bring a performance improvement. Who knows... – Dragan Bozanovic Aug 14 '15 at 16:18
  • 2
    Interestingly, the OP ignored the other statement that method `two()` (the document has three methods starting with `zero()`) using a `for (Foo a : mArray)` loop is even faster on systems without JIT. That’s much more confusing as it depends on the actual compiler whether the generated byte code has any difference between `one()` and `two()` at all. I smell esoteric optimization techniques… – Holger Aug 18 '15 at 15:12
0

Actually no it didn't make the loop faster, your idea is right when using the String.length()
the difference is that array.length is just a field which have a value that you just use it directly..
String.length() is a method which need a time to be executed.