65

Suppose there is the following code:

private static int DoSwitch(string arg)
{
    switch (arg)
    {
        case "a": return 0;
        case "b": return 1;
        case "c": return 2;
        case "d": return 3;
    }
    return -1;
}

private static Dictionary<string, Func<int>> dict = new Dictionary<string, Func<int>>
    {
        {"a", () => 0 },
        {"b", () => 1 },
        {"c", () => 2 },
        {"d", () => 3 },
    };

private static int DoDictionary(string arg)
{
    return dict[arg]();
}

By iterating over both methods and comparing, I am getting that the dictionary is slightly faster, even when "a", "b", "c", "d" is expanded to include more keys. Why is this so?

Does this have to do with cyclomatic complexity? Is it because the jitter compiles the return statements in the dictionary to native code only once? Is it because the lookup of the dictionary is O(1), which may not be the case for a switch statement? (These are just guesses)
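For reference, a minimal harness along these lines reproduces the measurement (the `Stopwatch` loop, iteration count, and class name are illustrative, not my exact harness):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public class Benchmark
{
    public static int DoSwitch(string arg)
    {
        switch (arg)
        {
            case "a": return 0;
            case "b": return 1;
            case "c": return 2;
            case "d": return 3;
        }
        return -1;
    }

    private static Dictionary<string, Func<int>> dict = new Dictionary<string, Func<int>>
    {
        { "a", () => 0 },
        { "b", () => 1 },
        { "c", () => 2 },
        { "d", () => 3 },
    };

    public static int DoDictionary(string arg)
    {
        return dict[arg]();
    }

    public static void Main()
    {
        string[] inputs = { "a", "b", "c", "d" };
        const int iterations = 10000000;

        // Warm up both paths so JIT compilation doesn't skew the first loop.
        DoSwitch("a"); DoDictionary("a");

        var sw = Stopwatch.StartNew();
        int sum = 0;
        for (int i = 0; i < iterations; i++)
            sum += DoSwitch(inputs[i & 3]);
        sw.Stop();
        Console.WriteLine("switch:     {0} ms (checksum {1})", sw.ElapsedMilliseconds, sum);

        sw = Stopwatch.StartNew();
        sum = 0;
        for (int i = 0; i < iterations; i++)
            sum += DoDictionary(inputs[i & 3]);
        sw.Stop();
        Console.WriteLine("dictionary: {0} ms (checksum {1})", sw.ElapsedMilliseconds, sum);
    }
}
```

The checksum is accumulated and printed so the JIT cannot dead-code-eliminate the calls being timed.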

Community
cubetwo1729
    The compiler may actually turn your `switch` into a dictionary at some point, so make sure you test this with the real input. – Brian Rasmussen Jul 23 '12 at 17:05
  • When examining the IL code, it seems that the only cached anonymous delegates are for the dictionary, not for the switch statements. – cubetwo1729 Jul 23 '12 at 17:08
  • 2
    Just wondering, why is your dictionary > instead of just ? – DGH Jul 23 '12 at 17:09
  • I want to make this more general, since the `Func`s will change, but I wanted to see a dictionary value of `Func` in its most basic case. – cubetwo1729 Jul 23 '12 at 17:11
  • @cubetwo1729 The compiler generates code differently based on the actual switch/case. – Brian Rasmussen Jul 23 '12 at 17:12
  • @Brian Rasmussen: Do you mean the compiler or the interpreter? From what I see, the generated IL code is for the switch statement. However, I do not know about what the CLR converts this to at runtime. – cubetwo1729 Jul 23 '12 at 17:18
  • The C# compiler generates different IL code depending on how large the switch/case code is. – Brian Rasmussen Jul 23 '12 at 17:40
  • As the number of cases in the switch statement grows, the code generated by the compiler may change as the compiler switches tactics to take advantage of observed patterns in the source code. I don't know offhand if the C# or JIT compilers perform this particular optimization, but a common compiler trick for switch statements when the case selectors are mostly sequential is to compute a jump vector. This executes in constant time. Subtract arg - "a", use result as index into jump table to jump to appropriate case block. – dthorpe Jul 23 '12 at 17:49
  • [Premature optimization is the root of all evil.](https://softwareengineering.stackexchange.com/a/80092) – Liam Sep 09 '21 at 08:24

4 Answers

65

The short answer is that the switch statement executes linearly, while the dictionary lookup executes in (amortized) constant time.

At the IL level, a small switch statement is usually implemented as a series of if-elseif statements comparing the switched variable against each case for equality. So this statement executes in time linearly proportional to the number of valid options; the cases are compared in the order they appear, and the worst-case scenario is that all the comparisons are tried and either the last one matches or none do. So, with 32 options, the worst case is that it's none of them, and the code will have made 32 comparisons to determine this.
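For the four-case method in the question, that lowering is roughly equivalent to hand-written C# like this (an illustrative sketch, not the exact compiler output):

```csharp
public static class SwitchLowering
{
    // Rough C# equivalent of the IL emitted for a small string switch:
    // a linear chain of string-equality tests, tried in source order.
    public static int DoSwitchLowered(string arg)
    {
        if (string.Equals(arg, "a")) return 0;
        if (string.Equals(arg, "b")) return 1;
        if (string.Equals(arg, "c")) return 2;
        if (string.Equals(arg, "d")) return 3;
        return -1; // worst case: every comparison tried, none matched
    }
}
```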

A Dictionary, on the other hand, uses an index-optimized collection to store values. In .NET, a Dictionary is based on a Hashtable, which has effectively constant access time (the disadvantage being extremely poor space efficiency). Other options commonly used for "mapping" collections like Dictionaries include balanced tree structures like red-black trees, which provide logarithmic access (and linear space efficiency). Any of these will allow the code to find the key corresponding to the proper "case" in the collection (or determine it does not exist) much faster than a switch statement can do the same.

EDIT: Other answers and commenters have touched on this, so in the interest of completeness I will as well. The Microsoft compiler does not always compile a switch to an if/elseif chain as I inferred originally. It typically does so with small numbers of cases, and/or with "sparse" cases (non-incremental values, like 1, 200, 4000). With larger sets of adjacent cases, the compiler will convert the switch into a "jump table" using the CIL `switch` instruction. With large sets of sparse cases, the compiler can implement a binary search to narrow the field, and then "fall through" a small number of sparse cases or implement a jump table for adjacent cases.

However, the compiler will typically choose the implementation that is the best compromise of performance and space efficiency, so it will only use a jump table for a large number of densely-packed cases. This is because a jump table requires a space in memory on the order of the range of cases it must cover, which for sparse cases is terribly inefficient memory-wise. By using a Dictionary in source code, you basically force the compiler's hand; it will do it your way, instead of compromising on performance to gain memory efficiency.

So, I would expect most cases in which either a switch statement or a Dictionary could be used in source to perform better when using a Dictionary. Large numbers of cases in switch statements are to be avoided anyway, as they are less maintainable.

KeithS
    If a dictionary is implemented with a hashtable, then the look-up time is O(1). It will still be relatively slower than a compile time *switch* though. – oleksii Jul 23 '12 at 17:33
  • @KeithS: How do you know that the .NET dictionary uses a red-black binary tree? – cubetwo1729 Jul 23 '12 at 17:33
  • The standard dictionary in .net uses a hashtable. –  Jul 23 '12 at 17:40
  • A .NET Dictionary actually uses a hashtable; I have edited accordingly. The Java equivalent, Map, is implemented three different ways, including with a RBT. – KeithS Jul 23 '12 at 17:40
  • And a switch statement *may* be implemented as a jump table, it is not required to be. – Ed S. Jul 23 '12 at 17:53
  • According to related questions, there are three ways the compiler can compile a switch: if/elseif, binary search, or jump table. The compiler will choose which one to use (or which combination) based on the number of cases and their adjacency. So it may fall through linearly, use only a CIL `switch` jump table, or use a binary search coupled with either. – KeithS Jul 23 '12 at 18:05
56

This is a good example of why micro-benchmarks can be misleading. The C# compiler generates different IL depending on the size of the switch/case. So switching on a string like this

switch (text) 
{
     case "a": Console.WriteLine("A"); break;
     case "b": Console.WriteLine("B"); break;
     case "c": Console.WriteLine("C"); break;
     case "d": Console.WriteLine("D"); break;
     default: Console.WriteLine("def"); break;
}

produces IL that essentially does the following for each case:

L_0009: ldloc.1 
L_000a: ldstr "a"
L_000f: call bool [mscorlib]System.String::op_Equality(string, string)
L_0014: brtrue.s L_003f

and later

L_003f: ldstr "A"
L_0044: call void [mscorlib]System.Console::WriteLine(string)
L_0049: ret 

I.e. it's a series of comparisons. So run time is linear.

However, adding additional cases, e.g. to include all the letters from a-z, changes the IL generated to something like this for each:

L_0020: ldstr "a"
L_0025: ldc.i4.0 
L_0026: call instance void [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::Add(!0, !1)

and

L_0176: ldloc.1 
L_0177: ldloca.s CS$0$0001
L_0179: call instance bool [mscorlib]System.Collections.Generic.Dictionary`2<string, int32>::TryGetValue(!0, !1&)
L_017e: brfalse L_0314

and finally

L_01f6: ldstr "A"
L_01fb: call void [mscorlib]System.Console::WriteLine(string)
L_0200: ret 

I.e. it now uses a dictionary instead of a series of string compares, and thus gets the performance of a dictionary.
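In C# terms, the lowered form of the large string switch is roughly equivalent to the following (an illustrative sketch with made-up names, not the literal decompilation; modern compiler versions may differ in detail):

```csharp
using System;
using System.Collections.Generic;

public static class LoweredSwitch
{
    // Cached map from case label to a dense case number, built once and reused.
    private static Dictionary<string, int> caseMap;

    public static void DoSwitchLowered(string text)
    {
        if (caseMap == null)
        {
            caseMap = new Dictionary<string, int>
            {
                { "a", 0 }, { "b", 1 }, { "c", 2 }, { "d", 3 } // ... and so on for each case
            };
        }

        // One hash lookup replaces the chain of string comparisons...
        if (caseMap.TryGetValue(text, out int caseNumber))
        {
            // ...followed by a constant-time dispatch (CIL `switch`) on the int.
            switch (caseNumber)
            {
                case 0: Console.WriteLine("A"); break;
                case 1: Console.WriteLine("B"); break;
                case 2: Console.WriteLine("C"); break;
                case 3: Console.WriteLine("D"); break;
                default: Console.WriteLine("def"); break;
            }
        }
        else
        {
            Console.WriteLine("def");
        }
    }
}
```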

In other words, the IL code generated for the two versions is different, and this is just at the IL level. The JIT compiler may optimize further.

TL;DR: So the moral of the story is to look at real data and profile, instead of trying to optimize based on micro-benchmarks.

Brian Rasmussen
5

As with many questions involving compiler codegen decisions, the answer is "it depends".

Building your own hash table will probably run faster than compiler generated code in many cases because the compiler has other cost metrics it is trying to balance that you are not: primarily, memory consumption.

A hash table will use more memory than a handful of if-then-else IL instructions. If the compiler spit out a hash table for every switch statement in a program, memory use would explode.

As the number of case blocks in the switch statement grows, you will probably see the compiler produce different code. With more cases, there is greater justification for the compiler to abandon small and simple if-then-else patterns in favor of faster but fatter alternatives.

I don't know offhand if the C# or JIT compilers perform this particular optimization, but a common compiler trick for switch statements when the case selectors are many and mostly sequential is to compute a jump vector. This requires more memory (in the form of compiler generated jump tables embedded in the code stream) but executes in constant time. Subtract arg - "a", use result as index into jump table to jump to appropriate case block. Boom, yer done, regardless of whether there are 20 or 2000 cases.
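A hand-written analogue of that jump vector, sketched in C# with a delegate array standing in for the table of code addresses a compiler would emit (names and values are illustrative):

```csharp
using System;

public static class JumpVector
{
    // Illustrative jump vector for the dense selectors 'a'..'d'.
    // Real compilers emit a table of jump targets; a delegate array
    // is the closest source-level analogue.
    private static readonly Func<int>[] table = { () => 0, () => 1, () => 2, () => 3 };

    public static int Dispatch(char arg)
    {
        int index = arg - 'a';                 // subtract to turn the selector into an index
        if ((uint)index < (uint)table.Length)  // one unsigned range check covers both bounds
            return table[index]();             // constant-time jump, with 4 cases or 2000
        return -1;                             // default block
    }
}
```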

A compiler is more likely to shift into jump table mode when the switch selector type is char or int or enum and the values of the case selectors are mostly sequential ("dense"), since these types can be easily subtracted to create an offset or index. String selectors are a little harder.

String selectors are "interned" by the C# compiler, meaning the compiler adds the string selector values to an internal pool of unique strings. The address or token of an interned string can be used as its identity, allowing for int-like optimizations when comparing interned strings for identity / byte-wise equality. With sufficient case selectors, the C# compiler will produce IL code that looks up the interned equivalent of the arg string (hash table lookup), then compares (or jump-tables) the interned token with the precomputed case selector tokens.
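You can observe interning directly. This small demo (illustrative) shows that identical literals share one pooled instance, which is what makes reference-based identity checks possible:

```csharp
using System;

public class InternDemo
{
    public static void Main()
    {
        string a = "case1";
        string b = "case1";
        // Both literals refer to the same interned instance.
        Console.WriteLine(object.ReferenceEquals(a, b)); // True

        // A string constructed at runtime is not automatically interned...
        string c = new string(new[] { 'c', 'a', 's', 'e', '1' });
        Console.WriteLine(object.ReferenceEquals(a, c)); // False

        // ...but String.Intern returns the pooled instance with the same value.
        Console.WriteLine(object.ReferenceEquals(a, string.Intern(c))); // True
    }
}
```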

If you can coax the compiler into producing jump-table code in the char/int/enum selector case, this can execute faster than using your own hash table.

For the string selector case, the IL code still has to do a hash lookup so any performance difference from using your own hash table is probably a wash.

In general, though, you shouldn't dwell on these compiler nuances too much when writing application code. Switch statements are generally much easier to read and understand than a hash table of function pointers. Switch statements that are big enough to push the compiler into jump table mode are often too big to be humanly readable.

If you find a switch statement is in a performance hotspot of your code, and you have measured with a profiler that it has tangible performance impact, then changing your code to use your own dictionary is a reasonable tradeoff for a performance gain.

Writing your code to use a hash table from the start, with no performance measurements to justify this choice, is over-engineering that will lead to unfathomable code with an unnecessarily higher maintenance cost. Keep It Simple.

dthorpe
3

By default, a switch on a string is implemented like an if / else if / else construct. As Brian suggested, the compiler will convert the switch to a hashtable when it gets bigger. Bart de Smet shows this in this Channel9 video (switch is discussed at 13:50).

The compiler does not do it for 4 items because it is being conservative, to prevent the cost of the optimization from outweighing the benefits. Building the hashtable costs a little time and memory.