Large Switch statements: Bad OOP?

Question

I've always been of the opinion that large switch statements are a symptom of bad OOP design. In the past, I've read articles that discuss this topic and they have provided altnerative OOP based approaches, typically based on polymorphism to instantiate the right object to handle the case.

I'm now in a situation that has a monsterous switch statement based on a stream of data from a TCP socket in which the protocol consists of basically newline terminated command, followed by lines of data, followed by an end marker. The command can be one of 100 different commands, so I'd like to find a way to reduce this monster switch statement to something more manageable.

I've done some googling to find the solutions I recall, but sadly, Google has become a wasteland of irrelevant results for many kinds of queries these days.

Are there any patterns for this sort of problem? Any suggestions on possible implementations?

One thought I had was to use a dictionary lookup, matching the command text to the object type to instantiate. This has the nice advantage of merely creating a new object and inserting a new command/type in the table for any new commands.

However, this also has the problem of type explosion. I now need 100 new classes, plus I have to find a way to interface them cleanly to the data model. Is the "one true switch statement" really the way to go?

I'd appreciate your thoughts, opinions, or comments.

How about use expression? So find the function by its name and call it. Name can be easily match with passed variables, so then we don't need to use switch. — creator, Dec 28 '16 at 04:09

score 35 · Accepted Answer · answered Feb 02 '09 at 23:59

35

You may get some benefit out of a Command Pattern.

For OOP, you may be able to collapse several similar commands each into a single class, if the behavior variations are small enough, to avoid a complete class explosion (yeah, I can hear the OOP gurus shrieking about that already). However, if the system is already OOP, and each of the 100+ commands is truly unique, then just make them unique classes and take advantage of inheritance to consolidate the common stuff.

If the system is not OOP, then I wouldn't add OOP just for this... you can easily use the Command Pattern with a simple dictionary lookup and function pointers, or even dynamically generated function calls based on the command name, depending on the language. Then you can just group logically associated functions into libraries that represent a collection of similar commands to achieve manageable separation. I don't know if there's a good term for this kind of implementation... I always think of it as a "dispatcher" style, based on the MVC-approach to handling URLs.

answered Feb 02 '09 at 23:59

nezroy

3,166
22
17

Agree with dictionary and delegate idea. You will probably also get a performance improvement using Dictionary over a big switch statement. The difficulty is in initializing it, though this can be automated using reflection if desired. – Zooba Feb 03 '09 at 01:18
Perhaps the initialization could be pulled from configuration. – davogones Feb 03 '09 at 02:53
The problem with reflection is that many web hosts runs in a trust-level that don't allow you to use it :( – Feb 03 '09 at 08:34
Interestingly enough if the only difference between the branches is that they are different subclasses of a base then you are basically describing a factory. – Thedric Walker Feb 18 '09 at 13:50

score 24 · Answer 2 · answered Feb 03 '09 at 00:10

I see having two switch statements as a symptom of non-OO design, where the switch-on-enum-type might be replaced with multiple types which provide different implementations of an abstract interface; for example, the following ...

switch (eFoo)
{
case Foo.This:
  eatThis();
  break;
case Foo.That:
  eatThat();
  break;
}

switch (eFoo)
{
case Foo.This:
  drinkThis();
  break;
case Foo.That:
  drinkThat();
  break;
}

... should perhaps be rewritten to as ...

IAbstract
{
  void eat();
  void drink();
}

class This : IAbstract
{
  void eat() { ... }
  void drink() { ... }
}

class That : IAbstract
{
  void eat() { ... }
  void drink() { ... }
}

However, one switch statement isn't imo such a strong indicator that the switch statement ought to be replaced with something else.

The problem is that you don't know if you receive a This or a That command over the network. Somewhere you have to decide whether to call new This() or new That(). After that abstracting with OO is a piece of cake. — , Feb 03 '09 at 08:32

score 20 · Answer 3 · answered Feb 03 '09 at 00:32

20

The command can be one of 100 different commands

If you need to do one out of 100 different things, you can't avoid having a 100-way branch. You can encode it in control flow (switch, if-elseif^100) or in data (a 100-element map from string to command/factory/strategy). But it will be there.

You can try to isolate the outcome of the 100-way branch from things that don't need to know that outcome. Maybe just 100 different methods is fine; there's no need to invent objects you don't need if that makes the code unwieldy.

answered Feb 03 '09 at 00:32

Jonas Kölker

7,680
3
44
51

13

+1. I get disheartened hearing folks complain about switches being smelly, but then suggest an alternative that is really just abstracted switch behavior smeared out across the entire system, as opposed to a neat stack. – Sep 18 '13 at 19:00
2

This might be pedantic, but is a map (string->command) really a branch? Conceptually it might be, in that there are multiple things that might happen depending on the value of a variable, but when I think of branching I think of something that explicitly checks a variable for possible values. Whereas a map is a direct lookup. In other words, the association of condition to result is done once at the start (at map population) rather than every time (by testing the variable). Under that definition, the map of commands does remove the branch. – Cam Jackson Mar 07 '14 at 05:12
2

@CamJackson: Sure, if you use a map you can use it with code that doesn't do any branching itself---it has outsourced the branching to whatever `map_lookup` function it calls (which will branch). I'm not sure what that observation buys you. "It's not really a branch if it happens elsewhere"? The association between input and output is done at compile time for switch (and other control flow) statements, and at run-time for data structures (most likely). It's not like the shape of the switch changes at run-time. Both kinds of look-up require (and one is) a branch. I hope that helps :-) – Jonas Kölker Mar 07 '14 at 22:06
@JonasKölker That's a good point. I guess I didn't think carefully about the actual data structure and how it works. So the only real way to remove the branch would be if your 100 commands conveniently happened to be called 0, 1, 2, etc, so that you could use them as indices directly. – Cam Jackson Mar 10 '14 at 01:34
@CamJackson: I think I agree---in that case, the only branching that will happen is due to the fact that what is stored in your 100-element array will be put into the program counter (in java this happens indirectly and behind the scenes). Here, I define branching to mean that after the program counter assumes one particular value, it can assume multiple different next values (under normal program execution, no cosmic rays). (Total nitpicking: if your commands were named 0..17 and 19..100, you could have an 101-element array A where A[18] is an error handler. You can generalize further.) – Jonas Kölker Mar 10 '14 at 18:46

score 4 · Answer 4 · answered Feb 03 '09 at 00:07

4

I think this is one of the few cases where large switches are the best answer unless some other solution presents itself.

answered Feb 03 '09 at 00:07

Loren Pechtel

8,945
3
33
45

score 3 · Answer 5 · answered Feb 02 '09 at 23:44

I see the strategy pattern. If I have 100 different strategies...so be it. The giant switch statement is ugly. Are all the Commands valid classnames? If so, just use the command names as class names and create the strategy object with Activator.CreateInstance.

score 3 · Answer 6 · answered Feb 06 '09 at 21:29

3

There are two things that come to mind when talking about a large switch statement:

It violates OCP - you could be continuously maintaining a big function.
You could have bad performance: O(n).

On the other hand a map implementation can conform to OCP and could perform with potentially O(1).

answered Feb 06 '09 at 21:29

quamrana

37,849
12
53
71

1

"a map [can be O(1)]" -- if the lookup reads k bits, it can distinguish between 2**k different keys. Thus, for n keys, you need to read at least log(n) bits. How can you do that in O(1) time when log(n) is unbounded? In which model? – Jonas Kölker Sep 18 '13 at 19:38
@JonasKölker: A map can be implemented with a hash table which can find the correct key with `O(1)`. See this answer: http://stackoverflow.com/a/1055261/4834 – quamrana Sep 18 '13 at 19:43
The hash function still has to process those log(n) bits. Hash tables (and some kinds of lookup tree) are O(1) with respect to values contained, but O(n) or worse with respect to the bit size of key values. – C. A. McCann Sep 20 '13 at 16:25
6

Any decent compiler uses a table for switch statements with consecutive case values, or a binary decision tree where this doesn't work or where based on the good knowledge of the processor architecture the binary decision tree is faster. For case statements with many values that are far apart, a good compiler will hash the value to reduce it to a table if that is beneficial. No matter what, the compiler will use the strategy that implements the switch statement in the fastest possible way. So excuse me, but refusing a switch statement for performance reasons is stupid. – gnasher729 Feb 19 '14 at 22:50

score 1 · Answer 7 · answered Feb 03 '09 at 08:19

I'd say that the problem is not the big switch statement, but rather the proliferation of code contained in it, and abuse of wrongly scoped variables.

I experienced this in one project myself, when more and more code went into the switch until it became unmaintainable. My solution was to define on parameter class which contained the context for the commands (name, parameters, whatever, collected before the switch), create a method for each case statement, and call that method with the parameter object from the case.

Of course, a fully OOP command dispatcher (based on magic such as reflection or mechanisms like Java Activation) is more beautiful, but sometimes you just want to fix things and get work done ;)

score 1 · Answer 8 · edited Dec 30 '10 at 01:36

1

You can use a dictionary (or hash map if you are coding in Java) (it's called table driven development by Steve McConnell).

edited Dec 30 '10 at 01:36

Peter Mortensen

30,738
21
105
131

answered Feb 12 '09 at 16:06

Nicolas Dorier

7,383
11
58
71

score 0 · Answer 9 · answered Sep 17 '13 at 19:51

I have recently a similar problem with a huge switch statement and I got rid off the ugly switch by the most simple solution a Lookup table and a function or method returning the value you expect. the command pattern is nice solution but having 100 classes is not nice I think. so I had something like:

switch(id)
    case 1: DoSomething(url_1) break;
    case 2: DoSomething(url_2) break;
    ..
    ..
    case 100 DoSomething(url_100) break;

and I've changed for :

string url =  GetUrl(id);  
DoSomthing(url);

the GetUrl can go to DB and return the url you are looking for, or could be a dictionary in memory holding the 100 urls. I hope this could help anyone out there when replacing a huge monstrous switch statements.

score 0 · Answer 10 · answered Jul 03 '14 at 13:55

0

Think of how Windows was originally written in the application message pump. It sucked. Applications would run slower with the more menu options you added. As the command searched for ended further and further towards the bottom of the switch statement, there was an increasingly longer wait for response. It's not acceptable to have long switch statements, period. I made an AIX daemon as a POS command handler that could handle 256 unique commands without even knowing what was in the request stream received over TCP/IP. The very first character of the stream was an index into a function array. Any index not used was set to a default message handler; log and say goodbye.

answered Jul 03 '14 at 13:55

Robert Achmann

1,986
3
40
66

3

That's simply not true. Switch statements use a jump table, so it takes no longer to switch on 2 items as it does on 2000. It's a single table lookup. – Erik Funkenbusch Jul 03 '14 at 15:21
Even when the switch statement compares on string values? – Robert Achmann Jul 03 '14 at 15:29
Well, obviously it depends on the length of the string value, but other than that, yes. Windows uses C, which can't switch on strings anyways, but in C# you can. It's still a lookup table. – Erik Funkenbusch Jul 03 '14 at 18:24

score 0 · Answer 11 · answered Feb 02 '09 at 23:43

One way I see you could improve that would make your code driven by the data, so for example for each code you match something that handles it (function, object). You could also use reflection to map strings representing the objects/functions and resolve them at run time, but you may want to make some experiments to assess performance.

score 0 · Answer 12 · answered Feb 03 '09 at 00:11

The best way to handle this particular problem: serialization and protocols cleanly is to use an IDL and generate the marshaling code with switch statements. Because whatever patterns (prototype factory, command pattern etc.) you try to use otherwise, you'll need to initialize a mapping between a command id/string and class/function pointer, somehow and it 'll runs slower than switch statements, since compiler can use perfect hash lookup for switch statements.

I'm curious what compiler implements switch statements of strings using perfect hashing? — oz10, Feb 03 '09 at 04:20

score 0 · Answer 13 · answered Feb 03 '09 at 02:42

0

Yes, I think large case statements are a symptom that one's code can be improved... usually by implementing a more object oriented approach. For example, if I find myself evaluating the type of classes in a switch statement, that almost always mean I could probably use Generics to eliminate the switch statement.

answered Feb 03 '09 at 02:42

cyclo

371
2
6

1

The problem is that this data is coming from elsewhere, it can't be object oriented as it comes in. Something needs to be done to convert it to object orientation and generally that's a large switch. – Loren Pechtel Feb 03 '09 at 05:24

score 0 · Answer 14 · answered Feb 03 '09 at 08:05

You could also take a language approach here and define the commands with associated data in a grammar. You can then use a generator tool to parse the language. I have used Irony for that purpose. Alternatively you can use the Interpreter pattern.

In my opinion the goal is not to build the purest OO model, but to create a flexible, extensible, maintainable and powerful system.

Large Switch statements: Bad OOP?

14 Answers14

Linked

Related