How is "switch" compiled?

Question

The switch opcode in CIL is quite limited compared to what C# offers. It takes a single jump table, a sequence of labels, and jumps to the label whose index equals the argument. So, unlike in C#, you can only switch on a non-negative integer, and only over a contiguous run of cases starting at 0 (a table of n labels covers 0 to n - 1).

On the other hand, C# switch can be used on strings and even negative numbers.

I have done some tests, and for strings it seems a simple == comparison is used, so despite a popular belief, switch is not faster than if/else if in that case. == calls Equals, which does an ordinal (i.e. bytewise) comparison.
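To illustrate what I mean, here is a minimal sketch. `SmallStringSwitch`, `Parse` and `ParseLowered` are made-up names, and the second method is only my hand-written guess at the shape of the generated code, not actual compiler output:

```csharp
using System;

static class SmallStringSwitch
{
    // A switch over a handful of string cases.
    public static int Parse(string s)
    {
        switch (s)
        {
            case "on": return 1;
            case "off": return 0;
            default: return -1;
        }
    }

    // Hand-written approximation of what the tests suggest:
    // a chain of ordinal == comparisons, tried in order.
    public static int ParseLowered(string s)
    {
        if (s == "on") return 1;
        if (s == "off") return 0;
        return -1;
    }
}
```

Both methods return the same result for every input, including null, which simply ends up in the default branch.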

Also, it seems that if the cases are "sequential enough", the statement is compiled to a real switch opcode. The compiler is even clever enough to find the minimum of the cases and subtract it from the value, effectively making the cases start from 0.

If there are some cases outside the sequential range, it turns them into normal comparisons but keeps the switch for the rest. If there are gaps inside the sequential range, the corresponding jump table entries point to the instruction after the switch (default:).
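Here is a sketch of that behaviour as I understand it. The names `DenseSwitchSketch`, `Describe` and `DescribeLowered` are hypothetical, and `DescribeLowered` is a hand-written approximation of the observed pattern, not decompiled output. The real result may differ between compilers and versions:

```csharp
using System;

static class DenseSwitchSketch
{
    // Cases 10, 11, 12 and 14 are "sequential enough"; 1000 is an outlier.
    public static string Describe(int value)
    {
        switch (value)
        {
            case 10: return "ten";
            case 11: return "eleven";
            case 12: return "twelve";
            case 14: return "fourteen";
            case 1000: return "thousand";
            default: return "other";
        }
    }

    // Approximation of the lowered form.
    public static string DescribeLowered(int value)
    {
        // Subtract the minimum case so the table starts at 0. Treating the
        // result as unsigned sends everything below 10 far out of range,
        // which mirrors the opcode comparing its argument as unsigned.
        uint index = unchecked((uint)(value - 10));

        // Stands in for the switch opcode with a 5-entry jump table.
        switch (index)
        {
            case 0: return "ten";
            case 1: return "eleven";
            case 2: return "twelve";
            // Slot 3 (value 13) is a gap: it falls out of the table.
            case 4: return "fourteen";
        }

        // The outlier becomes an ordinary comparison.
        if (value == 1000) return "thousand";
        return "other";
    }
}
```

The two methods agree for every int, including negative values and the gap at 13.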

So I wonder, what is the full set of rules the compiler considers when compiling the switch statement in C#? When does it decide to turn it into the switch opcode, and when is it transformed into normal comparisons only?

Edit: It looks like for a large number of string cases in a switch, the compiler caches them in a static Dictionary<string, int>.
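Written by hand, the pattern I saw looks roughly like the following. This is a sketch under assumptions: `StringTableSketch`, `CaseIndex` and `ToRgb` are made-up names, the case list is shortened for brevity (the compiler only seemed to do this for many cases), and other compiler versions may use a different strategy entirely:

```csharp
using System;
using System.Collections.Generic;

static class StringTableSketch
{
    // Hypothetical stand-in for the compiler-generated cache:
    // each case string maps to a dense index starting at 0.
    private static readonly Dictionary<string, int> CaseIndex =
        new Dictionary<string, int>
        {
            { "red", 0 },
            { "green", 1 },
            { "blue", 2 },
            { "black", 3 },
        };

    public static int ToRgb(string name)
    {
        int index;
        // TryGetValue throws on a null key, so null is checked first
        // and goes straight to the default result.
        if (name != null && CaseIndex.TryGetValue(name, out index))
        {
            // The indices are 0..3, so this part can be a real jump table.
            switch (index)
            {
                case 0: return 0xFF0000;
                case 1: return 0x00FF00;
                case 2: return 0x0000FF;
                case 3: return 0x000000;
            }
        }
        return -1; // default:
    }
}
```

The dictionary uses the default string comparer, which is ordinal, so the lookup matches the same strings that == would.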

`so despite a popular belief` I've never run into that before - is there a link you can share for that? — mjwills, Aug 02 '17 at 13:02
@mjwills I haven't done any research on the popularity of that belief, but I remember being sometimes told to use `switch` because of that reason. Of course `switch` is better to use nevertheless because it's clean and unrepetitive. — IS4, Aug 02 '17 at 13:05
The last time I actually heard someone express that belief was when I was back in high school (20 years ago), learning C++. At this point, I just assume it's cargo-cult programming. — Bradley Uffner, Aug 02 '17 at 13:20
A [Google search](https://www.google.com/search?q=switch+vs+if+performance&oq=switch+vs+if+performance&gs_l=psy-ab.3..0l4.153382.153382.0.153640.1.1.0.0.0.0.137.137.0j1.1.0....0...1.1.64.psy-ab..0.1.135.UuALbuYzvic) seems to show a lot of people trying to dispel, or performance profile this belief, so it must have been popular at one point. — Bradley Uffner, Aug 02 '17 at 13:27
@BradleyUffner: I don't think C++ would let you `switch` on a string 20 years ago. — Jacob Krall, Aug 02 '17 at 14:58
@JacobKrall That very well might be true. I was speaking in general terms about `if` vs `switch`, not just strings though. — Bradley Uffner, Aug 02 '17 at 14:59
What compiler are we talking about? For Roslyn, the relevant source is [here](https://github.com/dotnet/roslyn/blob/master/src/Compilers/CSharp/Portable/CodeGen/EmitStatement.cs#L1058) and also [here](https://github.com/dotnet/roslyn/blob/master/src/Compilers/CSharp/Portable/Lowering/LocalRewriter/LocalRewriter_SwitchStatement.cs). Reverse engineering the logic is left as an exercise to the reader, with the caveat that, obviously, this can change between versions. (The source includes comments about the heuristics used.) — Jeroen Mostert, Aug 02 '17 at 15:18
Describing the exact implementation details of the C# compiler (even if you narrowed the question to a single version) is too broad, and pointless because those details could change (and indeed, given new `switch` features, almost certainly have). The only requirement is the _semantics_, described in the language specification. That said, your test was apparently not very thorough, because the compiler uses a variety of strategies to implement `switch`, depending on the data type and values used. See marked duplicate for some discussion of that. — Peter Duniho, Aug 02 '17 at 16:44
@PeterDuniho Nobody is saying I am relying on implementation details, but the implementation is still useful to someone trying to emit custom CIL code and use a switch. I thought, for example, that the `switch` instruction was somehow used for all types and cases (silly, I know), and seeing "how C# does that" is surely helpful to CIL programmers. In the linked question (about the speed, not the implementation), there is nothing about the Dictionary, for example. And my tests are not thorough because this is a question, not an answer. — IS4, Aug 04 '17 at 21:06
