8

Given N elements, process only the first (0) and last (N-1) element.

But, if N = 1, only process the single element once.

Using a loop that runs once or twice, as appropriate, lets us avoid duplicating the loop body. If there's a readable way to do this, it has a benefit for source-code size. It may also have advantages for machine-code size, if the loop body is large, and the compiler doesn't end up duplicating it.


I tried incrementing by N-1 but it will not work when N=1 (loops forever). Are there tricks (reverse loop f.i) that will fix this?

for (i = 0 ; i < N ; i += (N - 1))

Edit:

My original problem concerns three nested loops in x,y,z direction, which is why I couldn't just process elem[0]) and elem[N-1]. Now I have the following

#define forEachLglBound(i_,j_,k_)                                   \
        for(Int i_ = 0;i_ < NPX;i_+=((NPX>1) ? (NPX-1) : 1))        \
            for(Int j_ = 0;j_ < NPY;j_+=((NPY>1) ? (NPY-1) : 1))    \
                for(Int k_ = 0;k_ < NPZ;k_+=((NPZ>1) ? (NPZ-1) : 1))

danny
  • 1,101
  • 1
  • 12
  • 34
  • Why do you need a loop for a constant two elements? – kaylum Nov 20 '15 at 00:28
  • why would you need a for loop if you only want to access the first and last element – R Nar Nov 20 '15 at 00:28
  • 6
    Why? Just write a function to do the work and call it with `array[0]` and `array[N-1]`. The intention will be much clearer. – clcto Nov 20 '15 at 00:29
  • I don't know what are you doing but if you need the first and last number, why you are using a loop? if you have an array then you can use something like: array[0] and array[size - 1], always check that the size is bigger than 1. – Simon Puente Nov 20 '15 at 00:32
  • I have three nested loops of that kind (x,y,z) directions, where N is an integer from 1 to 15 max. – danny Nov 20 '15 at 00:32
  • show your work, for better help – Simon Puente Nov 20 '15 at 00:34
  • That will do the work twice (i=0). I want to do it once when N=1. With what I have it loops forever. – danny Nov 20 '15 at 00:36

4 Answers4

7

How about the following line. Very close (textwise!) to the original solution.

for (i = 0 ; i < N ; i += (N > 1) ? N-1 : 1)
Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
balabhi
  • 641
  • 5
  • 5
  • The *ternary* operator has many unique uses. At first blush it may look like a rather odd bit of logic, but its inline conditional decision making extends what otherwise would not be possible in many useful ways. – David C. Rankin Nov 20 '15 at 03:58
  • This compiles to [fairly good code](http://goo.gl/OxdrQW), but isn't very easy to read, IMO. I think my answer is more readable. – Peter Cordes Nov 20 '15 at 05:53
4

If you can factor the loop body out into a function or macro, probably the most readable way is:

process( arr[0] );
if (N-1 != 0)
    process( arr[N-1] );

If that's not an option, or you think one of these loop structures is readable enough:

XOR can toggle a variable between 0 and the value you XOR with.

int i=0;
do {
    process(arr[i]);
} while(i ^= (N-1));  // xor to toggle, but stays zero if N-1 is zero

This compiles to better asm than any of the other options, as you can see on godbolt. I included the ideas from the other answers, including the simple if (which does very well when duplicating the operation is cheap).

The xor version keeps working very well in a triple-nested loop (see the while_xor_triple function at the bottom of the godbolt link).


Without using xor, I think this loop is moderately readable:

int i=0;
do {
    process( arr[i] );
} while( (i!=N-1) && (i=N-1) );

if i isn't already N-1, set it to that and redo the loop. This should make it easy for the compiler to do the test efficiently, without having to compute any expressions other than N-1. gcc still tests and branches on both operands of the && separately, though, but with less overhead than some of the other answers.


The 2nd while loop could compile to something like this (but unfortunately doesn't, because gcc doesn't do the trickery of my first two insns):

    xor   edx, edx
.repeat:
    mov   ecx, edx
      ... loop body using i (ecx)
    mov   edx, [N]
    dec   edx     ; edx = N-1
    cmp   ecx, edx
    jne .repeat

I thought of the XOR idea while thinking about how to do it in asm. IDK if a smart compiler could make this output from the while(i!=N-1 && i=N-1) version, but it's definitely better:

    xor   ecx, ecx
.repeat:
      ... loop body using i (ecx)
    mov   edx, [N]
    dec   edx        ; edx = N-1
    xor   ecx, edx   ; i ^= N-1
    jnz .repeat      ; jump if the last result wasn't zero

If you have N-1 stored in a register or memory, the mov/dec to compute it on the fly go away.

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
  • Excellent answer. Of course, we should all remember the first rule of optimization: optimize algorithms, not code. Only rarely is it necessary to optimize code! PS: congrats on reaching 100K reputation! – Jeff Learman Aug 02 '18 at 23:22
  • 1
    @JeffLearman: Yes, algorithmic improvements can be huge, but efficient asm for an efficient algorithm sometimes requires some hand-holding for the compiler, and can be worth it for hot loops. see [Why is this C++ code faster than my hand-written assembly for testing the Collatz conjecture?](https://stackoverflow.com/q/40354978) and especially [What is the efficient way to count set bits at a position or lower?](https://stackoverflow.com/a/34410357) for an example of hand-holding the compiler into more efficient asm for the same thing. – Peter Cordes Aug 02 '18 at 23:26
  • 1
    Or especially when manually vectorizing with SIMD intrinsics, or even a loop that you want to auto-vectorize, thinking about / looking at the compiler's asm can give you significant speedups. And BTW, thanks, that 100k has been a long time coming, answering in tags that don't get as much attention as C++ or especially C# / Java. I'm also the only SO user to have a [tag:x86-64] gold badge, and was the first to get [tag:x86] gold :) – Peter Cordes Aug 02 '18 at 23:29
3
for ( i = 0, repeat = 0; repeat < 2 && repeat < N; repeat++, i = N-1 )

or

repeat = (N<2) ? N : 2;
for ( i = 0; repeat > 0; repeat--, i = N-1 )
user3386109
  • 34,287
  • 7
  • 49
  • 68
-1

How about this, though don't have to use a for loop maybe, but it's a simple way and easy to understand.

for (int first = 0, last = N - 1;;) {
     // the stuff use first
     if (first != last) {
     // the stuff use last
     }
     break;
}
luoluo
  • 5,353
  • 3
  • 30
  • 41