Optimize Case Statement

Question

We have a method for writing command data to a device. The method first converts the data to the form accepted by the device and then writes the data to the serial port. The data conversion is done with a case statement as given below. For 10 commands we need to convert data. For other commands,we don't have to convert the data (around 10 commands).

Customer complaints that code is not optimized. Some of the commands for which data conversion is not required are frequently used.

Will direct if else statements optimize code?
Is there any other options to optimize the code in this case?

switch (cmd_no)
{
case CMD_WR_ACC:
    converted_command_data = (INT32)((((DOUBLE)cmd_data * CMD_WR_ACC_PARA1) / CMD_WR_ACC_PARA2) + 0.5);
    break;
case CMD_WR_BIAS:
    converted_command_data = (INT32)((((DOUBLE)cmd_data * CMD_WR_BIAS_PARA1) / CMD_WR_BIAS_PARA2) + 0.5);
    break;
case CMD_WR_SUP:
    converted_command_data = (INT32)((((DOUBLE)cmd_data * CMD_WR_SUP_PARA1) / CMD_WR_SUP_PARA2) + 0.5);
    break;
case CMD_WR_FIL:
    converted_command_data = (INT32)((((DOUBLE)cmd_data * CMD_WR_FIL_PARA1) / CMD_WR_FIL_PARA2) + 0.5);
    break;
    .
    .
default:
    converted_command_data = cmd_data;
    break;
}

cmd_data is INT32. But parameters are floating point contsants — Maanu, Mar 08 '17 at 14:34
First: measure it. Secondly: a construct of if/else ifs is effectively the worst case of a switch. — wildplasser, Mar 08 '17 at 14:36
*"Customer complaints that code is not optimized."* How you define *optimized*? Why is the code above not considered optimized? You have not defined the problem enough. — user694733, Mar 08 '17 at 14:36
What, exactly, are your customers complaining about? Is it executing too slowly? (that should be false if it isn't used... usually) Taking up too much space? Or are they just ignorant fools trying to tell you how to do your job and you need to diplomatically deal with nonsense? — Adam D. Ruppe, Mar 08 '17 at 14:37
Using `static inline` and perhaps an early if statement that handles the most frequent command (if there truly is one), should solve the problem. — 2501, Mar 08 '17 at 14:46
What are the definitions of CMD_WR_SUP_PARA1 etc, are they integers? — Lundin, Mar 08 '17 at 15:12
If `cmd_data` and all the `*_PARA1` and `*_PARA2` are integers, then you can rewrite the conversions presented to use only integer arithmetic. That could make those a bit faster. — John Bollinger, Mar 08 '17 at 15:29
@wildplasser No, a construct of `if - else if` is identical to a `switch`, on the machine code level. The only difference is that `if - else if` might be harder to read. — Lundin, Mar 08 '17 at 15:35
Given that you seem to consider the fact that "Some of the commands for which data conversion is not required are frequently used" to be relevant to the optimization question, I hypothesize that either you, the reviewer, or both of you suppose that it is comparatively expensive to reach the `default` block. That's probably wrong, but you might nevertheless mollify the reviewer simply by putting the `default` block first in the switch. Some people recommend that anyway as a matter of style. There is no reason to think that it will really make the code any faster, however. — John Bollinger, Mar 08 '17 at 16:01
@Lundin I wrote *the worst case* . That is: primitive compilers and the *I could not find any smarter way* for non-primitive compilers. — wildplasser, Mar 08 '17 at 16:06
Rather than using incorrect rounding `(INT32)((((DOUBLE)cmd_data * CMD_WR_ACC_PARA1) / CMD_WR_ACC_PARA2) + 0.5)` use `lrint(1.0*cmd_data * CMD_WR_ACC_PARA1 / CMD_WR_ACC_PARA2)`. — chux - Reinstate Monica, Mar 08 '17 at 16:33

dbush · Answer 1 · 2017-03-08T15:20:10.790

5

If by "not optimized" your customer means repetitive, it looks like you can put a good deal of this logic in a table.

Assuming your command types are sequentially numbered, i.e.

#define CMD_WR_ACC 0
#define CMD_WR_BIAS 1
#define CMD_WR_SUP 2
...

You define the table like this:

struct params {
    int convert;       // whether or not to convert
    double param1;
    double param2;
} params_table[] = {
    { 1, CMD_WR_ACC_PARA1, CMD_WR_ACC_PARA2 },
    { 1, CMD_WR_BIAS_PARA1, CMD_WR_BIAS_PARA2 },
    { 1, CMD_WR_SUP_PARA1, CMD_WR_SUP_PARA2 },
    ...
    { 0, 0, 0}
    ...
};

Then your code looks like this:

if (params_table[cmd_no].convert) {
    converted_command_data = (INT32)((((DOUBLE)cmd_data * 
         params_table[cmd_no].param1) / params_table[cmd_no].param2) + 0.5);
} else {
    converted_command_data = cmd_data;
}

If the starting index of your commands is not 0, you'll need to subtract the lowest command number from cmd_no to get the index into the table.

edited Mar 08 '17 at 15:20

answered Mar 08 '17 at 14:51

dbush

205,898
23
218
273

Nice refactoring +1. It probably isn't any faster though or significantly smaller than the switch. – JeremyP Mar 08 '17 at 14:54
The customer complained about the optimization during code review. I have to check what bothers him. The refactoring will not work as for other commands, the formulas are different. Profiling is not possible as itron system is not ready. – Maanu Mar 08 '17 at 14:58
Because the result is truncated to 32 bits, and a `double` has 52 bits of mantissa, you can safely combine the multiplier and the divisor into a single multiplicative double constant. Most C compilers cannot infer that, so it might even yield a runtime speedup. Also, your if clause should probably read `if (params_table[cmd_no].convert)`. – Nominal Animal Mar 08 '17 at 15:11
@NominalAnimal Thanks. Edited. – dbush Mar 08 '17 at 15:22
The compiler is probably smart enought to optimize the switch. Look at the generated assembly to get an idea. But this solution has the advantage of making the code cleaner. You can have a table of functions that perform the conversion, but that feels like overkill. – icebp Mar 08 '17 at 18:39

score 3 · Answer 2 · edited May 23 '17 at 12:02

Will direct if else statements optimize code?

No, a list of if - else if is the very same things as using switch on the machine code level, given that the if - else if is of the nature

if(integer == 1) 
{ ... }  
else if (integer == 2) 
{ ... }

Where 1 and 2 are any kind of compile-time integer constants. In which case it will yield 100% equivalent machine code as

switch(integer)
{
  case 1: ... break;
  case 2: ... break;
}

Is there any other options to optimize the code in this case?

A few things:

In case the constants used by the switch are adjacent, preferably going from 0 to n, the whole switch could be replaced by a function pointer jump table. Modern compilers should do just that behind the lines, but older compilers may struggle.
It isn't obvious while floating point is needed here, since you cast the result to int. Replacing floating point calculations with integer calculations might improve performance quite a bit, especially on microcontroller systems etc that lack FPU. If the only purpose of the float numbers is to round division, then consider that
```
// probably slow
int32_t a = (int32_t)((((double)cmd_data * CMD_WR_ACC_PARA1) / CMD_WR_ACC_PARA2) + 0.5);
```
is equivalent to
```
// probably faster
int32_t b = ( (cmd_data * CMD_WR_ACC_PARA1) + CMD_WR_ACC_PARA2/2 ) / CMD_WR_ACC_PARA2;
```
(See Rounding integer division (instead of truncating))

In fact, an `if` / `else` / `if` is not necessarily compiled to the same thing as a `switch`. A compiler is more likely to compile a `switch` to use a jump table than it is to do the same with an `if` / `else` / `if` block. Since the alternative is to test each condition in turn, a `switch` may sometimes yield different and more efficient code. — John Bollinger, Mar 08 '17 at 15:36

technosaurus · Answer 3 · 2017-03-09T07:23:06.953

This does not even need a case statement since it is easily vectorizable. One of the easiest ways to do this is to use an X-macro.

Since no values were provided, I am just using sequential integers as placeholders

#define XMACRO(X,...) \
  /*Label, Parameter1, Parameter2,...*/ \
  X(ACC, 1.0, 2.0, __VA_ARGS__) \
  X(BIAS, 3.0, 4.0, __VA_ARGS__) \
  X(SUP, 5.0, 6.0, __VA_ARGS__) \
  X(FILL, 7,0, 8.0, __VA_ARGS__) \
  //more entries here

#define AS_ENUM(label,...) CMD_WR_##label,
enum commands { XMACRO(AS_ENUM) CMD_WR_COUNT };

#define AS_PREMULTIPLIED(label, p1, p2,...) ((p1)/(p2)),
const double convert_multipliers[CMD_WR_COUNT] = { XMACRO(AS_PREMULTIPLIED) };

int convert_data(int cmd_data, unsigned cmd_no){
return (cmd_no >= CMD_WR_COUNT) 
  ? cmd_data
  : (int) (cmd_data * convert_multipliers[cmd_no] + 0.5);
}

Edit: It's actually possible that the switch statement is more optimal if the code is called often, but not in tight loops since most compilers can generate a jump table. The other parameters may even be compiled into a single constant if the correct options are enabled (such as -Ofast with gcc). Fortunately the X-macro can handle your case statements too:

#define AS_CASE(label, p1, p2,...) case CMD_WR_##label : \
  return (int) (cmd_data * ((p1)/(p2)) + 0.5);

switch(cmd_no){
  XMACRO(AS_CASE)
  default: return cmd_data;
}

The only way to know if its performance optimized for sure is to measure, but X-macros will help eliminate repetitive code and allow you to keep the data together in an easily readable and modifiable tabular format (and at least look optimized). When you have to add another parameter down the road, you will only need to add a row instead of grokking through every function.

score 0 · Answer 4 · edited May 23 '17 at 10:30

Rewriting your code from a switch to a if/then/else is not going to improve it much. The first thing to try, is compiling with the optimization options set to the highest. Let the compiler, gcc I assume, optimize it for you. That is where the low hanging fruit usually is.

How can this help? GCC by default balances code speed optimization with code size and debugging ease. By default, gcc generates that is easy to debug because the code flow more closely matches the source code, and Variables are stored (which is slow) but this enables easier debugging using gdb and breakpoints, since the variables will be easily visible on the stack.

When we turn on optimization, more optimization is done including storing some local variables only in processor registers (which are fast) and also by reordering the code to optimize code flow through the cpu execution pipeline. The downside is it is harder to follow when single stepping through the machine code. It also takes slightly longer to compile. But it is worth it!

To compile with optimization flags set to their highest, try -O3 which sets optimization more for code size and execution time:

-O3

There is a good discussion about the optimization flags here: https://stackoverflow.com/a/32941238/6693299

After compiling with the optimization flag set, look to the math which is using expensive floating point operations. Do you need them? Double multiplications and divisions are expensive, just to convert them to int32s? Are you sure this needs to be done. Can you convert the math to use only integer operations?

Can you share the values for the listed constants? You might be able to convert some or all of the math into a lookup table.

I'd say that rewriting to use `if`/`then`/`else` instead of `switch` will not help *at all*. There's no reason to think that the compiler would produce smaller or faster code that way, and the resulting source would be less readable (to me). — John Bollinger, Mar 08 '17 at 15:22
I agree @JohnBollinger. I said the same thing. Let the compiler do the optimization. That is what I suggested. That, and look if it is possible to eliminate double floating points using integer math or a lookup table. — ScottK, Mar 08 '17 at 15:28

Optimize Case Statement

4 Answers4