0
if (a == 1)
    //do something
else if (a == 2)
    //do something
else if (a == 3)
    //do something
else if (a == 4)
    //do something
else if (a == 5)
    //do something
else if (a == 6)
    //do something
else if (a == 7)
    //do something
else if (a == 8)
    //do something    

Now imagine, we know that a will mostly be 7 and we execute this block of code several times in a program. Will moving the (a == 7 ) check to top improve any time performance? That is:

if (a == 7)
     //do something
else if (a == 1)
    //do something
else if (a == 2)
    //do something
else if (a == 3)
    //do something

and so on. Does it improve anything or it's just wishful thinking?

Iharob Al Asimi
  • 52,653
  • 6
  • 59
  • 97
Sahil
  • 359
  • 2
  • 5
  • 13
  • 2
    You could always measure and find out. – NathanOliver Nov 16 '15 at 18:01
  • 4
    For something like that a `switch` statement might be better. – Some programmer dude Nov 16 '15 at 18:05
  • 1
    Use switch statement instead – M.kazem Akhgary Nov 16 '15 at 18:05
  • 2
    No, moving to top won't improve anything in the code emitted at compile time. You may want to give [branch prediction hints](http://stackoverflow.com/questions/30130930/is-there-a-compiler-hint-for-gcc-to-force-branch-prediction-to-always-go-a-certa) to get better optimized code. – πάντα ῥεῖ Nov 16 '15 at 18:07
  • Define "several times". More than a dozen? More than a hundred? – John Bode Nov 16 '15 at 18:09
  • 1
    Please choose one of C and C++. Also, at least clang recognizes this kind of construct and turns it into a fast jump table, regardless of how you order the conditions. – fuz Nov 16 '15 at 18:15
  • 1
    I would not optimize a problem having a complexity of N (cases) which is already the best 'algorithm', unless I am desperate. –  Nov 16 '15 at 18:16
  • I think that OPs Question was about run-time and not compile-time. So yes at Run-Time, if you check if a==7, then the other ELSE IFs won't matter anymore, because in the first IF the condition was met. – Michi Nov 16 '15 at 19:31
  • @Michi that isn't necessarily true as the compiler may re-order the tests or use a different structure entirely – M.M Nov 16 '15 at 23:58

7 Answers7

3

You can use switch case for improving the performance of program

switch (a) 
{
    case 1:
    break;    

    case 2:
    break;    

    case 3:
    break;    

    case 4:
    break;    

    case 5:
    break;    

    case 6:
    break;    

    case 7:
    break;    
}
Iharob Al Asimi
  • 52,653
  • 6
  • 59
  • 97
Sumit Marwha
  • 303
  • 5
  • 15
  • 4
    That would work in the simple cases (as OPs example), but the example may have been just a simplification of the actual use case (without revealing actual code), which may not be convertible to `switch` `case` construct. And question wasn't about "how can I make `if`-`else` construct faster?", but rather about "can I gain some performance if I move the clauses a bit in a way that it aligns with a typical use case?" – Algirdas Preidžius Nov 16 '15 at 18:11
  • 1
    @AlgirdasPreidžius i believe thats XY problem. This directs OP into right direction – M.kazem Akhgary Nov 16 '15 at 18:15
  • Can you please explain how it `switch` case would improve the performance? – Vallabh Patade Nov 16 '15 at 18:16
  • The question was *... Does it improve anything or it's just wishful thinking?...* How does this answer the question? – Haris Nov 16 '15 at 18:16
  • Suppose a=7 then u don't need to check all the conditions and flow will be gone directly to 7 condition by switch so it will definitely improve performance as compare to if else conditions – Sumit Marwha Nov 16 '15 at 18:22
  • Every case will be checked and then it will reach to `7`. – Haris Nov 16 '15 at 18:24
  • the `switch` construct is usually implemented using a jump table, so it takes the same time to get to any of the case blocks. The if-else construct has to check each value in turn, so that the last item takes the most machine cycles to reach. – FredK Nov 16 '15 at 18:36
  • @FredK, AFAIK That will happen only happen if the cases are consecutive. And even if it doesn't, you did not justify how this answered the OP's question. – Haris Nov 16 '15 at 18:47
2

Since the if conditions are checked in the order specified, yes. Whether it is measurable (and, hence, should you care) will depend on how many times that portion of the code is called.

Anon Mail
  • 4,660
  • 1
  • 18
  • 21
  • 1
    If the compiler can prove that the evaluations of the conditions don't have side effects, it might well reorder the conditions, as per the as-if-rule. – EOF Nov 16 '15 at 18:12
  • I'll take your word on that point. But in the specific example posted, how would the compiler optimize given that a is not a constant. – Anon Mail Nov 16 '15 at 18:14
  • 1
    @AnonMail For example, clang either generates a binary decision tree or a jump table for cascaded `if` statements. – fuz Nov 16 '15 at 18:17
  • Unless `a` is `volatile`, checking `if (a == 1)` *cannot* have side-effects. Thus, whether you first check `if (a == 1)` or `if (a == 2)` makes no difference in the observable behavior of the program. – EOF Nov 16 '15 at 18:17
  • @EOF I will take your word on that also. But, given the specific code, what would cause the compiler to change the order of the conditions? In short, how can it know to optimize in this specific case? – Anon Mail Nov 16 '15 at 18:21
  • 2
    As @FUZxxl and others have alluded to, the compiler might start with `if (a > 4)` and continue cutting the possibilities in half on every `if`, to reduce the maximum depth of the worst case (the one where, in the original code, you would have to do *all* of the checks). – EOF Nov 16 '15 at 18:24
  • @EOF I was not aware of this. That's interesting. Thanks for the enlightenment. – Anon Mail Nov 16 '15 at 18:29
1

Imagine you go to a hotel and are given a room with number 7.

You have to go across the hall checking every room until you find the room with number 7.

Will the time taken depend on how many rooms you checked before you got the one you have been alloted?

Yes..


But know this, in your scenario the time difference will be very very minute to be noticed.

For scenarios where there are too many numbers to be checked, putting the one in the beginning which occur many number of times does improve performance. In fact, this methodology is used by some network protocols for comparing protocol numbers

Haris
  • 12,120
  • 6
  • 43
  • 70
1

There is some penalty to be paid in case of compiler couldn't make the constructs to a jump table, I would think the switch/case implementation will be compiled as a jump table in assembly and if else not as a jump table then switch/case has an edge over if else. Again I guess this depends on architecture and compilers.

In case of switch/case compiler will be able to generate the asm jump table only based on the constants (eg. consecutive values) that we provide.

The test i ran on my machine gave the assembly for if/else as this (not jump table) ,

main: .LFB0:

    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $16, %rsp
    movl    $7, -4(%rbp)
    cmpl    $1, -4(%rbp)
    jne     .L2
    movl    $97, %edi
    call    putchar
    jmp     .L3
.L2:

    **cmpl    $2, -4(%rbp)
    jne     .L4**
    movl    $97, %edi
    call    putchar
    jmp     .L3
.L4:

    **cmpl    $3, -4(%rbp)
    jne     .L5**
    movl    $97, %edi
    call    putchar
    jmp     .L3
.L5:

    **cmpl    $4, -4(%rbp)
    jne     .L6**
    movl    $97, %edi
    call    putchar
    jmp     .L3
.L6:

    **cmpl    $5, -4(%rbp)
    jne     .L7**
    movl    $97, %edi
    call    putchar
    jmp     .L3
.L7:

    **cmpl    $6, -4(%rbp)
    jne     .L8**
    movl    $97, %edi
    call    putchar
    jmp     .L3
.L8:

    cmpl    $7, -4(%rbp)
    jne     .L9
    movl    $97, %edi
    call    putchar
    jmp     .L3
.L9:

    cmpl    $8, -4(%rbp)
    jne     .L3
    movl    $97, %edi
    call    putchar
.L3:

    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

But for switch/case (jump table), main: .LFB0:

    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $16, %rsp
    movl    $7, -4(%rbp)
    cmpl    $7, -4(%rbp)
    ja      .L2
    movl    -4(%rbp), %eax
    movq    .L4(,%rax,8), %rax
    jmp     *%rax
    .section        .rodata
    .align 8
    .align 4
.L4:

    .quad   .L2
    .quad   .L3
    .quad   .L5
    .quad   .L6
    .quad   .L7
    .quad   .L8
    .quad   .L9
    .quad   .L10
    .text
.L3:

    movl    $97, %edi
    call    putchar
    jmp     .L2
.L5:

    movl    $97, %edi
    call    putchar
    jmp     .L2
.L6:

    movl    $97, %edi
    call    putchar
    jmp     .L2
.L7:

    movl    $97, %edi
    call    putchar
    jmp     .L2
.L8:

    movl    $97, %edi
    call    putchar
    jmp     .L2
.L9:

    movl    $97, %edi
    call    putchar
    jmp     .L2
.L10:

    movl    $97, %edi
    call    putchar
    nop
.L2:

    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc

From the tests i feel that switch/case is better as it doesn't have to go through the earlier entries to find a match.

I would suggest to try gcc -S option to generate the assembly to check the asm to see.

EOF
  • 6,273
  • 2
  • 26
  • 50
Randolf
  • 125
  • 1
  • 9
1

TL;DR version

For so few values, any differences in speed will be immeasurably small, and you'd be better off sticking with the more straightforward, easier-to-understand version. It isn't until you need to start searching through tables containing thousands to millions of entries that you'll want something smarter than a linear ordered search.

James Michener Version

Another possibility not yet mentioned is to do a partitioned search, like so:

if ( a > 4 )
{
  if ( a > 6 )
  {
    if ( a == 7 ) // do stuff
    else // a == 8, do stuff
  }
  else
  {
    if ( a == 5 ) // do stuff
    else // a == 6, do stuff
  }
}
else
{
  if ( a > 2 )
  {
    if ( a == 3 ) // do stuff
    else // a == 4, do stuff
  }
  else
  {
    if ( a == 1 ) // do stuff
    else // a == 2, do stuff
  }
}

No more than three tests are performed for any value of a. Of course, no less than three tests are performed for any value of a, either. On average, it should give better performance than the naive 1-8 search when the majority of inputs are 7, but...

As with all things performance-related, the rule is measure, don't guess. Code up different versions, profile them, analyze the results. For testing against so few values, it's going to be hard to get reliable numbers; you'll need to execute each method thousands of times for a given value just to get a useful non-zero time measurement (it also means that any difference between the methods will be ridiculously small).

Stuff like this can also be affected by compiler optimization settings. You'll want to build at different optimization levels and re-run your tests.

Just for giggles, I coded up my own version measuring several different approaches:

  • naive - the straightforward test from 1 to 8 in order;
  • sevenfirst - check for 7 first, then 1 - 6 and 8;
  • eightfirst - check from 8 to 1 in reverse order;
  • partitioned - use the partitioned search above;
  • switcher - use a switch statement instead of if-else;

I used the following test harness:

int main( void )
{
  size_t counter[9] = {0};
  struct timeval start, end;
  unsigned long total_nsec;

  void (*funcs[])(int, size_t *) = { naive, sevenfirst, eightfirst, partitioned, switcher };

  srand(time(NULL));
  printf("%15s %15s %15s %15s %15s %15s\n", "test #", "naive", "sevenfirst", "eightfirst", "partitioned", "switcher" );
  printf("%15s %15s %15s %15s %15s %15s\n", "------", "-----", "----------", "----------", "-----------", "--------" );

  unsigned long times[5] = {0};

  for ( size_t t = 0; t < 20; t++ )
  {
    printf( "%15zu ", t );
    for ( size_t f = 0; f < 5; f ++ )
    {
      total_nsec = 0;
      for ( size_t i = 0; i < 1000; i++ )
      {
        int a = generate();
        gettimeofday( &start, NULL );
        for ( size_t j = 0; j < 10000; j++ )
          (*funcs[f])( a, counter );
        gettimeofday( &end, NULL );
      }
      total_nsec += end.tv_usec - start.tv_usec;
      printf( "%15lu ", total_nsec );
      times[f] += total_nsec;
      memset( counter, 0, sizeof counter );
    }
    putchar('\n');
  }

  putchar ('\n');
  printf( "%15s ", "average:" );
  for ( size_t i = 0; i < 5; i++ )
    printf( "%15f ", (double) times[i] / 20 );
  putchar ('\n' );
  return 0;
}

The generate function produces random numbers from 1 through 8, weighted so that 7 appears half the time. I run each method 10000 times per generated value to get measurable times, for 1000 generated values.

I didn't want the performance difference between the various control structures to get swamped by the // do stuff code, so each case just increments a counter, such as

if ( a == 1 )
  counter[1]++;

This also gave me a way to verify that my number generator was working properly.

I run through the whole sequence 20 times and average the results. Even so, the numbers can vary a bit from run to run, so don't trust them too deeply. If nothing else, they show that changes at this level don't result in huge improvements. For example:

     test #           naive      sevenfirst      eightfirst     partitioned        switcher
     ------           -----      ----------      ----------     -----------        --------
          0             121             100             118             119             111
          1             110             100             131             120             115
          2             110             100             125             121             111
          3             115             125             117             105             110
          4             120             116             125             110             115
          5             129             100             110             106             116
          6             115             176             105             106             115
          7             111             100             111             106             110
          8             139             100             106             111             116
          9             125             100             136             106             111
         10             106             100             105             106             111
         11             126             112             135             105             115
         12             116             120             135             110             115
         13             120             105             106             111             115
         14             120             105             105             106             110
         15             100             131             106             118             115
         16             106             113             116             111             110
         17             106             105             105             118             111
         18             121             113             103             106             115
         19             130             101             105             105             116

   average:      117.300000      111.100000      115.250000      110.300000      113.150000

Numbers are in microseconds. The code was built using gcc 4.1.2 with no optimization running on a SLES 10 system1.

So, running each method 10000 times for 1000 values, averaged over 20 runs, gives a total variation of about 7 μsec. That's really not worth getting exercised over. For something that's only searching among 8 distinct values and isn't going to run more than "several times", you're not going to see any measurable improvement in performance regardless of the method used. Stick with the method that's the easiest to read and understand.

Now, for searching a table containing several hundreds to thousands to millions of entries, you definitely want to use something smarter than a linear search.


1. It should go without saying, but the results above are only valid for this code built with this specific compiler and running on this specific system.
John Bode
  • 119,563
  • 19
  • 122
  • 198
0

There should be a slight difference, but this would depend on the platform and the nature of the comparison - different compilers may optimise something like this differently, different architectures will have different effects as well, and it also depends on what the comparison actually is (if it is a more complex comparison than a simple primitive type comparison, for example)

It is probably good practice to test the specific case you are actually going to use, IF this is something that is likely to actually be a performance bottleneck.

Alternatively, a switch statement, if usable, should have the same performance for any value independent of order because it is implemented using an offset in memory as opposed to successive comparisons.

davedavedave
  • 487
  • 6
  • 11
  • No platform can optimize the order of the if conditions. By definition, they have to occur in the order specified by the programmer. – Anon Mail Nov 16 '15 at 18:10
  • @AnonMail No. That's not correct. They only have to be placed in an order that is not externally (i.e. from their behaviour) distinguishable from the order specified by the programmer. The compiler can of course rearrange the conditions as long as doing so does not cause any observable side effects. – fuz Nov 16 '15 at 18:16
  • @FUZxxl I'll take your word on that. There was a similar comment on my posted answer. But in this specific case I don't see how any rearranging can occur since the variable a is not a constant. – Anon Mail Nov 16 '15 at 18:18
  • @AnonMail Why does `a` have to be a constant for this to work? – fuz Nov 16 '15 at 18:26
-5

Probably not, you still have the same number of conditions and still either of them might evaluate to true, even if you check for a == 7 first, all other conditions are potentially true therefore will be evaluated.

The block of code that eis executed if a == 7 might be executed quicker when the program runs - but essentially you're code is still the same, with the same number of statements.

Iharob Al Asimi
  • 52,653
  • 6
  • 59
  • 97
Nunchy
  • 948
  • 5
  • 11
  • 2
    This is an if/else construct. The code will stop checking as soon as a match occurs. – Anon Mail Nov 16 '15 at 18:08
  • Well it's an if/else if - more than one of them could potentially be true...if I've got if (a == 7) i mgiht also have if (a <= 10) for example...both are true. – Nunchy Nov 16 '15 at 18:10
  • Both are true. But read first comment again. If you go inside `if` the else part and if else part will be skipped no matter if they can be true or not. – M.kazem Akhgary Nov 16 '15 at 18:12
  • @Nunchy one or more could be but in an `if...else if` you would never move on to the else part which stops you from checking any of the other `if`s – NathanOliver Nov 16 '15 at 18:12
  • @Nunchy That's true but the OP posted specific code. In his specific case my statement is true. For the general case, you are, of course, correct. – Anon Mail Nov 16 '15 at 18:12