3

Java 5 gave us the for-each loops, which should be used where ever possible.

But what is the most efficient idiom if you need to use the index of the array in the block?

// Option (1)
for (int i = array.length - 1; i >= 0; i--) {
    // Code.
}

// Option (2)
for (int i = 0; i < array.length; i++) {
    // Code.
}

// Option (3)
for (int i = 0, n = array.length; i < n; i++) {
    // Code.
}

(Obviously, this won't make a big performance difference in most programs but humor me.) :-)

  1. loops backwards and is hideous. Maybe even cache unfriendly? Or can modern processer detect striding backwards in memory?

  2. is shorter and I can see how the JIT compiler can figure out that array never changes and thus length is constant so it can essentially replace it with (3). But does it do that? (Assume the JVM is Oracle's Hotspot/Java 7.)

  3. is suggested by Joshua Bloch in Item 45 in Effective Java as the fastest option if it's some Collection.size() that's the upper bound. But does it also apply to arrays? From the bytecode (see below) I can see that one save an arraylength instruction per cycle (pre-optimized).

This question about for loops in the Dalvik Virtual Machine, lists (1)-(3) as fastest to slowest. However, the information is from 2008 and Dalvik is much more mature today so I hardly think that's still the case.

Looking at the bytecode generated from the above examples, there are obvious differences:

Compiled from "ForLoops.java"
public class ForLoops extends java.lang.Object{
static int[] array;

public ForLoops();
  Code:
   0:   aload_0
   1:   invokespecial   #10; //Method java/lang/Object."<init>":()V
   4:   return

public static void forLoop1();
  Code:
   0:   getstatic   #17; //Field array:[I
   3:   arraylength
   4:   iconst_1
   5:   isub
   6:   istore_0
   7:   goto    13
   10:  iinc    0, -1
   13:  iload_0
   14:  ifge    10
   17:  return

public static void forLoop2();
  Code:
   0:   iconst_0
   1:   istore_0
   2:   goto    8
   5:   iinc    0, 1
   8:   iload_0
   9:   getstatic   #17; //Field array:[I
   12:  arraylength
   13:  if_icmplt   5
   16:  return

public static void forLoop3();
  Code:
   0:   iconst_0
   1:   istore_0
   2:   getstatic   #17; //Field array:[I
   5:   arraylength
   6:   istore_1
   7:   goto    13
   10:  iinc    0, 1
   13:  iload_0
   14:  iload_1
   15:  if_icmplt   10
   18:  return

}
Community
  • 1
  • 1
Nicholas
  • 2,147
  • 3
  • 23
  • 31
  • In theory, 1 and 3 are more efficient that 2, in the general case, since `array.length` need only be evaluated once. However, evaluating `length` in Java is quite simple, especially if JITCed, and so the theoretical difference all but vanishes in practice. Most likely it depends on what's in the loop body and how the JITC optimizer melds loop control with the loop body. – Hot Licks Jul 07 '13 at 21:38
  • @Nicholas Nice work pulling out the bytecode! But it's important to bear in mind that the bytecode isn't ultimately what gets executed due to the JIT. If we could get JITed instructions, that *would* be the final answer, but that's probably unrealistic because (a) it's platform-specific, and (b) there are different "levels" of JITing. Also, I don't know of any JIT that gives any transparency into code generated during runtime, at least not without poking around in the JVM memory space yourself. – sigpwned Jul 08 '13 at 15:51
  • I should of course be pointed out that the "most efficient" choice is generally the one that best fits the rest of the code. For instance, iterating backwards is sometimes awkward and sometimes very useful. Likewise, checking `array.length` on each iteration may actually be required, if the array object gets replaced inside the loop. – Hot Licks Jul 10 '13 at 02:34
  • Good point. But this question is about the typical case which many people use. If the straight forward solution is the most efficient then a lot of people could continue coding the way they do with peace in mind (myself included). – Nicholas Jul 10 '13 at 06:55

1 Answers1

1

You could easily test this for yourself; if you do, you should probably have a look at these tips about performance testing from the HotSpot creators themselves. This answer may be useful too. If you do decide to test these implementations, please let us know what you find!

However, on the whole, you shouldn't worry about these things too much. Instead, focus on writing readable code and getting things done. Most of the time, you'll find that your code runs Fast Enough without any "tricks." Modern hardware is very fast, and the JIT is pretty good, too.

If you do find that your code is running too slow, profile first, then optimize. Anything else is premature. And remember, from a man who is smarter than any of us:

"Premature optimization is the root of all evil." -- Donald Knuth

EDIT: Since you seem curious about this less in terms of "how should I write my code?" and more in terms of a thought experiment, I'd expect all of these options to run at more or less identical speeds.

None of those loops are likely to tweak out the branch predictor (for reasonably-sized arrays, anyway). I'd expect the underlying JIT to transform any recurring array length references from (2)-style to (3)-style. All things being equal, (1)'s cache performance isn't worse than (2)'s or (3)'s simply because it runs backwards; for a given array, the same cache lines will get loaded and hit (or not hit) just as often.

Of course, my expectations are irrelevant. The only way to know is to test! When testing, though, remember that writing good microbenchmarks is hard.

Community
  • 1
  • 1
sigpwned
  • 6,957
  • 5
  • 28
  • 48
  • 1
    I could test it, but I would rather have generel insight into how the JVM functions. Let's also assume that it's performance critical. – Nicholas Jul 07 '13 at 20:03
  • @Nicholas I feel you wanted to write code in C language but asked to use Java :) – Grijesh Chauhan Jul 07 '13 at 20:04
  • No not at all - I think Java is great but sometimes less deterministic when it comes to performance. (Biggest culprit being the GC of course, which is also the biggest blessing when it comes to producing new features fast). – Nicholas Jul 07 '13 at 20:07
  • @Nicholas OK, ..my answer is:: by definition arrays are in continue memory allocation(so --/++ doesn't matter) and `.length ` is not function call so I think all should equally fast. Also consider optimization-phase. – Grijesh Chauhan Jul 07 '13 at 20:09
  • @GrijeshChauhan -- Actually, to an optimizer `++` vs `--` can matter quite a bit when addressing arrays, as one must consider overflows and other issues as computations are moved. However, to predict which would be better would require, at the very least, knowledge of the particular optimizer algorithms and target instruction set. – Hot Licks Jul 07 '13 at 21:41