1

For example: I have a function consisting of a while loop (this one would check for primes)

function isprime(int number){
int i=2;
int max= (int) sqrt(number)+1;
   while(i<max){
       if(number%i==0){
          return 0;
       }
       i++;
   }
return 1;
}

I know this would be a very poor algorithm for testing primes, but my question focuses more on loops anyway. Currently a forth of the function's operations are "just" checking the condition. For larger numbers, this can be an awful lot.

(Quick Example 1: If "number" is 1009, that would be checking the while condition 30 times, 30 operations for the index, and 29*2 operations for the if condition. That's 118 operations)

I realize that I can just cut&paste within the while condition and having the the index pass the maximum, while resulting in additional operations, doesn't falsify the return value. So if I cut everything starting from "if..." to "i++;" and paste it three (or n) times, checking the while condition would only take up 1/10 (or 1/(1+3n) of the operations, while creating a maximum of +2*3(or +(n-1)*3) unnecessary operations.

(Quick Example 2: If "number" is 1009, that would mean checking the while condition 11 times, 33 operations for the index, and 33*2 operations for the if condition. That's 100 operations, 13 less)

As I am currently experimenting with very big numbers (in layman's terms: "the condition will be false for a very, very, very long time") so pasting the if condition and the increment thousands of times would be useful, but very impractical - so my question is:

Is there a tool (or a technique I am missing) to do that for me, but keeps the code clearly and easy to modify?

Thanks in advance!

  • 3
    Do you mean [Loop Unrolling](http://en.wikipedia.org/wiki/Loop_unrolling)? – Blastfurnace Nov 08 '14 at 17:50
  • @Blastfurnace With loop unrolling, the loop condition still needs to be evaluated upon each iteration, else the unrolled loop would semantically be different from the original. – The Paramagnetic Croissant Nov 08 '14 at 18:02
  • @Blastfurnace I'm referring to the fact that `while (i--) { foo(); }` cannot be unrolled to `while (i -= 4) { foo(); foo(); foo(); foo(); }`. I'm perfectly aware of how optimizing compilers perform loop unrolling. – The Paramagnetic Croissant Nov 08 '14 at 19:27
  • @TheParamagneticCroissant: It's still not clear what you were trying to convey in your first comment. You and I know about loop unrolling, I merely suggested to the OP a term he can research that might answer his question. – Blastfurnace Nov 08 '14 at 21:08
  • @Blastfurnace OP was looking for something that would enable him not to evaluate the condition of a loop upon each iteration. Loop unrolling is not the thing that can do that, it is all I was trying to say. – The Paramagnetic Croissant Nov 08 '14 at 21:09

2 Answers2

5

Your question is a bit unclear.

First, you could change slightly your algorithm; e.g. incrementing by 2, not by 1 (since every prime number above 2 is odd).

Then, when asked to optimize (e.g. with g++ -Wall -O2), a compiler generally do some loop unrolling, as if it "copied" (and constant-folded) a loop's body several times.

With GCC, you might help the optimizer by e.g. using __builtin_expect, e.g. with

#ifdef __GNUC__
#define MY_UNLIKELY(P) __builtin_expect((P),0)
#else
#define MY_UNLIKELY(P) (P)
#endif 

and then you'll code

 if(MY_UNLIKELY(number%i==0))
   return 0;

At last, manually optimizing your code might not worth the pain, you should benchmark. (The CPU cache is very important today, so unrolling too much might slow down the code, see also __builtin_prefetch in GCC and this).

You could also consider meta-programming and multi-staged programming facilities; in languages like Meta Ocaml or Common Lisp meta-programming means much more than C++11 template things. You could consider generating C code at runtime (then dlopening it), or using JIT compilation libraries like libjit (e.g. do partial evaluation). Read also J.Pitrat's blog about meta-combinatorial search, and wikipage on eval. My MELT system shows that these techniques can (painfully) be used in relation with C++ (MELT generates C++ code at runtime so do meta-programming this way).

Community
  • 1
  • 1
Basile Starynkevitch
  • 223,805
  • 18
  • 296
  • 547
  • First of all thank you for the tips regarding the function, thou it is only meant as an example (no good for test large primes anyway) – GregLSteiner Nov 09 '14 at 00:38
  • I didn't know the term "loop unrolling" and had a hard time finding what I was looking for without this keyword! Thank you very much for your answer and detailed descriptions of possible solutions. I have a long read ahead of me and some catching up to do! Thanks to your advice I now know what I am looking for! (sorry for double posting) – GregLSteiner Nov 09 '14 at 00:55
3

There is a weird snippet called Duff's Device which is applicable to the case where you know in advance how many iterations should occur.

So for example, let's suppose you have this loop:

for (size_t i = 0; i != max; ++i) {
    call(i);
}

With Duff's Device, you can transform it to only check every 4 iterations like so:

size_t i = 0;

switch (max % 4) {
case 0: do { call(i++);
case 3:      call(i++);
case 2:      call(i++);
case 1:      call(i++);
           } while (i != max);
}

It thus combines manual unrolling with compensating for the fact that the number of iterations might not be a multiple of the number of times you unroll. There are more explanations here.

Though personally, I would advise using a slightly clearer format:

// Adjust "max" to a multiple of N:
size_t const adjusted_max = (max + N - 1) / N * N;
size_t const leftover_max = max - adjusted_max;

for (size_t i = 0; i != adjusted_max; i += N) {
    call(i);
    call(i);
    // N times
}

switch (leftover_max) {
case N-1: call(adjusted_max + N - 1);
case N-2: call(adjusted_max + N -2);
...
case 1  : call(adjusted_max);
case 0  : ;
}

Note that you can process the leftover either before or after actually, it does not matter.


With that being said, Loop Unrolling is a basic compiler optimization; so before using such a weird implement, do check whether your compiler would not be already doing it for you...

Community
  • 1
  • 1
Matthieu M.
  • 287,565
  • 48
  • 449
  • 722
  • I highly advise against using strange code like Duff's device. First of all, the compiler is likely to do a better job optimizing your loop. Secondly, and more importantly, I believe in simplicity over cleverness. I'd rather someone understand my code at a glance rather than spend 15 minutes just to figure out that I unrolled a simple loop. In regards the OP's question, I'd leave my loops as is (unrolled) in my source. There have been many cases where I tried to "optimize" my code only to find out that I actually slowed it down because the compiler was already doing it better. – JS1 Nov 08 '14 at 23:09
  • 1
    @JS1: we agree, this is why I presented a simpler alternative and put a disclaimer at the end of the answer. There are some cases where the compiler cannot unroll because it conservatively assumes there are dependencies, or because it just did not match one of the expected patterns, but even then it is pretty rare that the comparison dominates the time of loop body itself... as with all matters of optimization, it boils down to "profile and experiment". – Matthieu M. Nov 09 '14 at 11:35