How can I enhance this for loop

Question

I use Sublime Text as my text editor, and I have always written my for loops like so:

for(int i = 0; i < lengthOfSomething; i++){}

Recently I was looking at the editor's code hinter and noticed an entry labeled "Enhanced For Loop", which showed me this:

for(int i = lengthOfSomething - 1; i >= 0; i--){}

I am very curious how this is "enhanced", both for my own understanding and for anyone else who might be wondering.

This might interest you: http://stackoverflow.com/questions/1656506/which-of-these-pieces-of-code-is-faster-in-java – Joseph Helfert, May 01 '14 at 18:54
You might be interested to know that I just ran some tests using both loop formats in JavaScript. The performance was identical for both. It was a small sample with a small number of iterations, but I suspect the performance would be the same even with large numbers of iterations. – Nick Zimmerman, May 01 '14 at 19:13
That is interesting. I am sure it has a lot to do with the language used. There does seem to be one less operation performed in @barakmanos's answer. – Zach Starnes, May 01 '14 at 19:17
@zachstarnes, the thing is, the compiler does the necessary optimizations for you. I have heard that writing ++i in a for loop instead of i++ is faster, because i++ makes a copy of i and ++i doesn't. I'm not sure that is correct, but even if it is, why worry about it? That's why we have compilers: so they will do optimizations for us. – santahopar, May 01 '14 at 19:37
@armanali I understand what you mean, but this might be good to know if you have larger for loops over larger data objects and are making precise performance optimizations. – Zach Starnes, May 01 '14 at 19:42
@zachstarnes, the thing with for loops is that every time it does one iteration, it does an "if" to see whether the condition has been met or not. If your condition is very simple (e.g. comparing if i – santahopar, May 01 '14 at 19:54
@armanali I also find that, as far as speed is concerned, and if you are just trying to compare one value to all values in the array (if that is what you are using), a for loop is much cleaner to see and read. I recently had to check a value against all values in an array, and it was much shorter to write a function with a for loop than to write an if or switch statement to check all of the values. – Zach Starnes, May 01 '14 at 20:01

score 3 · Accepted Answer · answered May 01 '14 at 19:25

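I doubt the answer to this question has much to do with performance. I wrote a version of each for loop in C++ and used GCC to get the resulting ASM. The "Enhanced For Loop" (version 2) actually has one more instruction overall than the "Non Enhanced For Loop": computing the starting index lengthOfSomething - 1 takes extra setup instructions (movl -12(%rbp), %eax and subl $1, %eax), which outweighs the one instruction saved in the loop test.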

Either way, the performance difference is negligible. This sort of choice should be made based on the readability of the code and on whether you want the operation inside the for loop applied to the elements in the forward or the reverse direction, i.e. searching from the beginning of a list or from the end.

First version of the for loop:

int main(int argc, char* argv[])
{
    int lengthOfSomething = 10;
    int valueToInc = 0;

    for(int i = 0; i < lengthOfSomething; i++)
    {
        valueToInc += i;
    }
}

Resulting ASM

    .file   "main.cpp"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    %edi, -20(%rbp)
    movq    %rsi, -32(%rbp)
    movl    $10, -12(%rbp)
    movl    $0, -4(%rbp)
    movl    $0, -8(%rbp)
    jmp .L2
.L3:
    movl    -8(%rbp), %eax
    addl    %eax, -4(%rbp)
    addl    $1, -8(%rbp)
.L2:
    movl    -8(%rbp), %eax
    cmpl    -12(%rbp), %eax
    jl  .L3
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (GNU) 4.8.2 20140206 (prerelease)"
    .section    .note.GNU-stack,"",@progbits

Second version of the for loop:

int main(int argc, char* argv[])
{
    int lengthOfSomething = 10;
    int valueToInc = 0;

    for(int i = lengthOfSomething - 1; i >= 0; i--)
    {
        valueToInc += i;
    }
}

Resulting ASM

    .file   "main2.cpp"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    %edi, -20(%rbp)
    movq    %rsi, -32(%rbp)
    movl    $10, -12(%rbp)
    movl    $0, -4(%rbp)
    movl    -12(%rbp), %eax
    subl    $1, %eax
    movl    %eax, -8(%rbp)
    jmp .L2
.L3:
    movl    -8(%rbp), %eax
    addl    %eax, -4(%rbp)
    subl    $1, -8(%rbp)
.L2:
    cmpl    $0, -8(%rbp)
    jns .L3
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (GNU) 4.8.2 20140206 (prerelease)"
    .section    .note.GNU-stack,"",@progbits
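
As a rough illustration of picking the loop direction by intent rather than by speed, here is a minimal C++ sketch. The helper names findFirst and findLast are made up for this example and do not come from the answer; both return -1 when the value is absent.

```cpp
#include <vector>

// Hypothetical helper: index of the FIRST element equal to target, or -1.
// Scans front to back, like the first version of the loop.
int findFirst(const std::vector<int>& values, int target)
{
    const int length = static_cast<int>(values.size());
    for (int i = 0; i < length; i++)
    {
        if (values[i] == target)
            return i;
    }
    return -1;
}

// Hypothetical helper: index of the LAST element equal to target, or -1.
// Scans back to front, like the second version of the loop.
int findLast(const std::vector<int>& values, int target)
{
    const int length = static_cast<int>(values.size());
    for (int i = length - 1; i >= 0; i--)
    {
        if (values[i] == target)
            return i;
    }
    return -1;
}
```

For a vector holding 4, 7, 4, 9, findFirst(values, 4) gives 0 and findLast(values, 4) gives 2, so the direction changes the answer, not just the timing.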

score 2 · Answer 2 · barak manos · 2014-05-01T19:39:55.260

The first for loop, when compared with the second for loop, contains an additional Load operation that is executed upon every iteration (when the CPU loads the value of variable lengthOfSomething).

Having said that, this Load operation will probably be eliminated if you enable compiler optimization, as the value of variable lengthOfSomething remains unchanged throughout the execution of the loop.

You might be able to understand this better by comparing the disassembly code of each loop.

In the first loop, the following operations are performed upon each iteration:

mov eax,dword ptr [i]
add eax,1
mov dword ptr [i],eax
mov eax,dword ptr [i]
cmp eax,dword ptr [lengthOfSomething]

In the second loop, the following operations are performed upon each iteration:

mov eax,dword ptr [i]
sub eax,1
mov dword ptr [i],eax
cmp dword ptr [i],0

As you can see, the first loop contains the additional mov eax,dword ptr [i] operation. This is because the CPU architecture supports the comparison of a memory content and a constant, but it doesn't support the comparison of two memory contents. Please note that the disassembly code in the example above was generated by the compiler of Microsoft Visual C++ 2010, with compiler-optimization disabled. But it is reasonable to assume that other compilers would generate similar disassembly code.

OK, so in the example above, the actual improvement is due to the fact that the first loop compares two variables (hence one of them must be loaded into a register), while the second loop compares a variable and a constant (an operation which is supported by the CPU architecture). But the general reasoning still holds: the second loop has fewer variable-access operations than the first.
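
If you want to experiment with this yourself, one option is to wrap each loop in its own function, as in the sketch below. The names sumCountingUp and sumCountingDown are hypothetical and are not part of the original test. Both return the same sum, so any difference lies only in the instructions the compiler emits. Compiling with something like g++ -O2 -S lets you compare the output with optimization enabled; no claim is made here about what a particular compiler will produce.

```cpp
// Hypothetical helper: adds 0 + 1 + ... + (lengthOfSomething - 1), counting up.
// The loop condition compares i against a variable on every iteration.
int sumCountingUp(int lengthOfSomething)
{
    int valueToInc = 0;
    for (int i = 0; i < lengthOfSomething; i++)
    {
        valueToInc += i;
    }
    return valueToInc;
}

// Hypothetical helper: adds the same values, counting down.
// The loop condition compares i against the constant 0 on every iteration.
int sumCountingDown(int lengthOfSomething)
{
    int valueToInc = 0;
    for (int i = lengthOfSomething - 1; i >= 0; i--)
    {
        valueToInc += i;
    }
    return valueToInc;
}
```

Returning the sum (instead of discarding it as in a bare main) also gives the optimizer less excuse to delete the loop outright, which makes the generated code easier to compare.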

edited May 01 '14 at 19:39

answered May 01 '14 at 18:49

barak manos


This is only true if you are actually checking the length with each iteration. That is hard to tell from the code in the question. Best practice would be to assign the length to a variable before the loop. If that is done, then there is no extra execution occurring. – Nick Zimmerman May 01 '14 at 18:55
@Nick Zimmerman: Say what??? In the first loop, the condition `i < lengthOfSomething` is checked upon every iteration. This condition contains **2** variables that the CPU needs to load. In the second loop, the condition `i >= 0` is checked upon every iteration. This condition contains **1** variable that the CPU needs to load. – barak manos May 01 '14 at 18:57
@barak manos If it is a variable, its value will be cached by the interpreter, and the load won't be executed every time. If you are doing something.length, then you get extra executions. – Nick Zimmerman May 01 '14 at 19:00
@Nick Zimmerman: Interpreter????? This is not Python. If anything, the value can be cached **during runtime, by the CPU**. But that's not the case here. The case is how the compiler "chooses" to turn this source code into a list of machine operations. And in the case of `i < lengthOfSomething`, it has to add a *Load* operation for each variable (unless compiler optimization is enabled, as I have explicitly stated in the answer). During runtime, the CPU might be able to take some advantage of caching. It might, and it might not, depending on caching heuristics. – barak manos May 01 '14 at 19:15
But the whole thing will happen only in the first loop. In the second loop, it's not going to be a problem, hence the improved performance... get it? – barak manos May 01 '14 at 19:16
This answer is very similar to the one by @JamesMcMullan – Zach Starnes May 01 '14 at 19:36
@zachstarnes: And who might that be? – barak manos May 01 '14 at 19:43
@barakmanos I was just referencing another answer that was similar to yours in that it was tested in the same language – Zach Starnes May 01 '14 at 19:44

score 0 · Answer 3 · edited May 23 '17 at 11:57

This is a holdover from older compilers and interpreters (on old chip architectures) that would do addition slightly slower than subtraction.

Modern compilers and interpreters (coupled with modern chip architectures) really don't have this issue.

See "JavaScript loop performance - Why is to decrement the iterator toward 0 faster than incrementing" for more detail on the issue.

answered May 01 '14 at 18:48

Nick Zimmerman


Neel · Answer 4 · 2014-05-02T04:26:46.747

As shown here: http://www.cis.upenn.edu/~matuszek/General/JavaSyntax/enhanced-for-loops.html

The usual way to step through all the elements of an array in order is with a "standard" for loop, for example:

for (int i = 0; i < myArray.length; i++) {
    System.out.println(myArray[i]);
}

The so-called enhanced for loop is a simpler way to do this same thing. (The colon in the syntax can be read as "in.")

for (int myValue : myArray) {
    System.out.println(myValue);
}

The enhanced for loop was introduced in Java 5 as a simpler way to iterate through all the elements of a Collection (Collections are not covered in these pages). It can also be used for arrays, as in the above example, but this is not the original purpose.

Enhanced for loops are simple but inflexible. They can be used when you wish to step through the elements of the array in first-to-last order, and you do not need to know the index of the current element. In all other cases, the "standard" for loop should be preferred. Two additional statement types, break and continue, can also control the behavior of enhanced for loops.
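
Since the other answers in this thread test with C++, here is a rough C++ counterpart of the same idea: the range-based for introduced in C++11, with break and continue controlling it. This is only a sketch of the analogous C++ feature, not the Java construct quoted above, and the function name sumPositivesUntilZero is invented for the example.

```cpp
#include <vector>

// Hypothetical helper: sums the positive values, stopping at the first zero.
int sumPositivesUntilZero(const std::vector<int>& values)
{
    int total = 0;
    for (int value : values)  // the colon can be read as "in"
    {
        if (value == 0)
            break;     // leave the loop entirely
        if (value < 0)
            continue;  // skip this element and move on to the next one
        total += value;
    }
    return total;
}
```

As with the Java form, there is no index variable here, so this shape only fits first-to-last traversal where the position of the element does not matter.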

ADVANCED

The break and continue statements can be used with statement labels. For more information:

https://blogs.oracle.com/CoreJavaTechTips/entry/using_enhanced_for_loops_with

The example code does not show a for-in loop as the enhanced loop, but good points about the for-in enhanced loop. — Nick Zimmerman, May 01 '14 at 18:57
Maybe that is because of the formatting. I am not able to format well since I'm on mobile; please format this answer. – Neel, May 01 '14 at 19:10
@artlessnoise there was not meant to be a specific language but just a general knowledge of which would be better or more "enhanced" — Zach Starnes, May 01 '14 at 19:22
@artlessnoise I did however add some languages that were more times used in the answers on this question. — Zach Starnes, May 01 '14 at 19:24

score -1 · Answer 5 · answered May 01 '14 at 18:46

I am not sure that is what an "enhanced for loop" is. Whenever I hear enhanced for loop I think of something like this https://blogs.oracle.com/CoreJavaTechTips/entry/using_enhanced_for_loops_with.

jaesanx

