Actually, it can be somewhat faster, despite some of the comments here.
Let's test it:
using System;
using System.Diagnostics;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            int count = 1000000000;

            Stopwatch sw = Stopwatch.StartNew();

            for (int way = 1; way <= 3; ++way)
                test1(count, way);

            var elapsed1 = sw.Elapsed;
            Console.WriteLine("test1() took " + elapsed1);

            sw.Restart();

            for (int way = 1; way <= 3; ++way)
                test2(count, way);

            var elapsed2 = sw.Elapsed;
            Console.WriteLine("test2() took " + elapsed2);
            Console.WriteLine("test2() was {0:f1} times as fast.", ((double)elapsed1.Ticks)/elapsed2.Ticks);
        }

        static void test1(int count, int way)
        {
            for (int i = 0; i < count; ++i)
            {
                switch (way)
                {
                    case 1: doWork1(); break;
                    case 2: doWork2(); break;
                    case 3: doWork3(); break;
                }
            }
        }

        static void test2(int count, int way)
        {
            switch (way)
            {
                case 1:
                    for (int i = 0; i < count; ++i)
                        doWork1();
                    break;

                case 2:
                    for (int i = 0; i < count; ++i)
                        doWork2();
                    break;

                case 3:
                    for (int i = 0; i < count; ++i)
                        doWork3();
                    break;
            }
        }

        static void doWork1()
        {
        }

        static void doWork2()
        {
        }

        static void doWork3()
        {
        }
    }
}
Now this is quite unrealistic, since the doWork() methods don't do anything. However, it will give us a baseline timing.
The results I get for a RELEASE build on my Windows 7 x64 system are:
test1() took 00:00:03.8041522
test2() took 00:00:01.7916698
test2() was 2.1 times as fast.
So moving the loop into the switch statement makes it MORE THAN TWICE AS FAST.
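To get some intuition for why, consider what the JIT compiler can plausibly do here: the doWork() bodies are empty, so once they are inlined, test2() is left with nothing but an empty loop after a single switch, while test1() still pays for the switch on every iteration. The following is only a hand-written sketch of that guess, not actual JIT output:

static void test1Sketch(int count, int way)
{
    // the dispatch on 'way' is still evaluated on every one of the
    // billion iterations, even though each case now does nothing
    for (int i = 0; i < count; ++i)
    {
        switch (way)
        {
            case 1: break;  // empty doWork1() inlined away
            case 2: break;  // empty doWork2() inlined away
            case 3: break;  // empty doWork3() inlined away
        }
    }
}

static void test2Sketch(int count, int way)
{
    // the switch would run once up front; whichever case is taken,
    // all that remains is a loop with an empty body
    for (int i = 0; i < count; ++i)
    {
    }
}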
Now let's make it a little bit more realistic by adding some code into doWork():
using System;
using System.Diagnostics;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            int count = 1000000000;

            Stopwatch sw = Stopwatch.StartNew();

            for (int way = 1; way <= 3; ++way)
                test1(count, way);

            var elapsed1 = sw.Elapsed;
            Console.WriteLine("test1() took " + elapsed1);

            sw.Restart();

            for (int way = 1; way <= 3; ++way)
                test2(count, way);

            var elapsed2 = sw.Elapsed;
            Console.WriteLine("test2() took " + elapsed2);
            Console.WriteLine("test2() was {0:f1} times as fast.", ((double)elapsed1.Ticks)/elapsed2.Ticks);
        }

        static int test1(int count, int way)
        {
            int total1 = 0, total2 = 0, total3 = 0;

            for (int i = 0; i < count; ++i)
            {
                switch (way)
                {
                    case 1: doWork1(i, ref total1); break;
                    case 2: doWork2(i, ref total2); break;
                    case 3: doWork3(i, ref total3); break;
                }
            }

            return total1 + total2 + total3;
        }

        static int test2(int count, int way)
        {
            int total1 = 0, total2 = 0, total3 = 0;

            switch (way)
            {
                case 1:
                    for (int i = 0; i < count; ++i)
                        doWork1(i, ref total1);
                    break;

                case 2:
                    for (int i = 0; i < count; ++i)
                        doWork2(i, ref total2);
                    break;

                case 3:
                    for (int i = 0; i < count; ++i)
                        doWork3(i, ref total3);
                    break;
            }

            return total1 + total2 + total3;
        }

        static void doWork1(int n, ref int total)
        {
            total += n;
        }

        static void doWork2(int n, ref int total)
        {
            total += n;
        }

        static void doWork3(int n, ref int total)
        {
            total += n;
        }
    }
}
Now I get these results:
test1() took 00:00:03.9153776
test2() took 00:00:05.3220507
test2() was 0.7 times as fast.
Now it's SLOWER to put the loop into the switch! This counterintuitive result is typical of these kinds of things, and demonstrates why you should ALWAYS perform timing tests when you are trying to optimise code. (And optimising code like this is usually something you shouldn't even do unless you have good reason to suspect that there is a bottleneck. You'd be better off spending your time cleaning up your code. ;))
I did some other tests, and with slightly simpler doWork() methods, test2() was the quicker one again. It really depends on what optimisations the JIT compiler can apply.
NOTE: I think the reason for the speed difference in my second test is that the JIT compiler can optimise away the 'ref' parameters when it inlines the calls to doWork() as they are made in test1(), whereas for test2() (where each call is the entire body of its own loop) it cannot do so, for whatever reason.
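If that guess is right, then after inlining, test1() would behave roughly like the following hand-written sketch, with the 'ref' indirection gone and each case reduced to a plain add. Again, this is speculation about what the JIT might produce; I haven't verified it against the generated machine code:

static int test1InlinedSketch(int count, int way)
{
    // speculative: each doWork(i, ref totalN) call replaced by a direct
    // add to the corresponding local, with no ref parameter left
    int total1 = 0, total2 = 0, total3 = 0;

    for (int i = 0; i < count; ++i)
    {
        switch (way)
        {
            case 1: total1 += i; break;
            case 2: total2 += i; break;
            case 3: total3 += i; break;
        }
    }

    return total1 + total2 + total3;
}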