2

It seems that in general, in C (and many other languages) one could equally well do, e.g.:

if (x > 0){
    y = value1;
}
else{
    y = value2;
}

or

y = (x>0)*value1 + (x <= 0)*value2;

The first case seems stylistically better because it's easier to read (for most people?). But then again, the second case is more compact. More importantly, is there any difference in performance? It seems that the second case may be slower, since both boolean expressions are evaluated and there is a multiplication by zero... but then, I vaguely recall that if statements have some small additional overhead. I realize I could actually measure the speed, but I'm hoping someone has a more general answer.

DanB
  • 83
  • 6
  • 3
    Write the most readable code. When it does not work as fast as required then start doing profiling – Ed Heal Oct 21 '16 at 15:49
  • 2
    I suggest putting the code into http://godbolt.org or just looking at the assembly produced by your favorite compilers. A sufficiently advanced compiler could compile your two bits of code to the same assembly. – David Grayson Oct 21 '16 at 15:49
  • 1
    If you have code where [branch prediction could fail](http://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array), then it would possibly be faster to use a branchless operation. It depends entirely on your data, your compiler's ability to optimize, and the specific applications of your program, though. – Random Davis Oct 21 '16 at 15:50
  • Good compilers will probably turn this into the same code. For what it's worth, a branch-less implementation that uses bitwise operations on two's complement hardware would look like this: `(value1&~(x>>8*sizeof(x)-1))|(value2&(x>>8*sizeof(x)-1))`. Do this only if you absolutely must. – Sergey Kalinichenko Oct 21 '16 at 15:55
  • Re: performance, how often does the operation happen over the lifetime of your program? – John Bode Oct 21 '16 at 15:55
  • @JohnBode Many, many times :) – DanB Oct 21 '16 at 16:04
  • If you really **need** max. performance, you should check your target very carefully and profile different variants. In addition to the two above, your CPU might provide special SIMD for parallel multiplication plus special select instructions (select separate parts of two registers using a mask). Check the assembler output of the standard constructs first. You might have a clever compiler. – too honest for this site Oct 21 '16 at 16:32
  • @DanB "many, many times" is something that indicates you haven't actually benchmarked it. A proper answer to the "how many times in the lifetime of your program" question would be "Not that often, but in a critical code path, a couple million times per second", or "nearly never, maybe a couple tens of thousand times overall". You see, different people would understand **very** different things under the term "many, many times". My guess is that with your vague formulation, it's not going to be critical. – Marcus Müller Oct 23 '16 at 16:06

3 Answers

2
y = (x>0)*value1 + (x <= 0)*value2;

This is terrible code, and shouldn't be written (you'd need to be 100% sure that the comparison operators return 1 for "true"; whether that's the case is subject to a lot of things).

That's what the ternary operator in C-alike languages is for:

y = (x > 0) ? value1 : value2;

More importantly, is there any difference in performance?

I'd love to have an argument on "importance", but:

no. The ternary operator should compile to the same machine code as the `if` construct, if the compiler is worth anything. Your `y = (cond)*a + (!cond)*b` construct is more likely to be slower, because of the strange abuse of multiplication; but then again, modern optimizing compilers may well eliminate that anyway.

Marcus Müller
  • 34,677
  • 4
  • 53
  • 94
  • 1
    Relational operators are guaranteed to yield either `0` or `1`. Online draft of [C11 Standard](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf), 6.5.9/6: "Each of the operators `<` (less than), `>` (greater than), `<=` (less than or equal to), and `>=` (greater than or equal to) shall yield 1 if the specified relation is true and 0 if it is false.107) The result has type `int`." – John Bode Oct 21 '16 at 15:53
    I'd add this: to reduce computational time, you need to get rid of time-consuming operations like `*` and `/` – Tony Tannous Oct 21 '16 at 15:54
    It very much depends on the CPU and specific situation, e.g. if there are conditional statements or jumps required, branch-prediction, pipeline-length/penalty for branches, the probability of true/false, etc. Iff time is an issue, all this has to be taken into account. – too honest for this site Oct 21 '16 at 16:30
  • @Olaf makes an interesting point here: for example, many CPU archs (x86, armv7, cuda[where comparisons end up being branches]) tend to have "conditional jumps", but no "save the result of a comparison into a variable" instructions – so that your multiplication solution would need more instructions, including the jump – Marcus Müller Oct 23 '16 at 15:44
  • 1
    @MarcusMüller: Actually e.g. ARM has conditional instructions, which can execute e.g. an addition, move, etc. depending on (comparison) flags. No jump necessary. On others, the multiplication can be faster, as it avoids prefetch queue flushes, etc. It might even be faster iff there are conditional instructions, depending on the operands/location of the operands due to superscalar execution. That's my point: it is not that simple. All I can agree with in your answer is: leave it to the compiler, write the most readable code, **until you prove it is not fast enough**. – too honest for this site Oct 23 '16 at 15:50
  • yep! In the absence of any better knowledge, one should probably write readable code, since, if anything, compilers will be written to optimize that within their abilities! – Marcus Müller Oct 23 '16 at 15:59
1

The performance penalty involved in using an `if` is due to the fact that it involves branching. However, expressions like `(x > 0)` may also involve branching. Also, in your single expression, as you mentioned, you're evaluating two different conditions, whereas in the `if` you're only evaluating one. As others have mentioned, the compiler may well optimize the single expression and the `if` into the same code. Ultimately, the `if` is much better because it's clear what it's doing. As @Ed mentioned, if you want to optimize performance, do profiling; then you can focus on the parts of your code that take the most time.

Andy Schweig
  • 6,597
  • 2
  • 16
  • 22
1

First code for correctness, then for clarity (the two are often connected, of course!). Finally, and only if you have real empirical evidence that you actually need to, you can look at optimizing. Premature optimization really is evil. Optimization almost always costs you time, clarity, maintainability. You'd better be sure you're buying something worthwhile with that.

y = (x>0)*value1 + (x <= 0)*value2;

Don't use it in any of your code. It is a good example of how to write terrible code, because it is not intuitive at all. Also, whether you get any performance gain depends on your machine architecture (in particular, on the number of cycles a multiplication instruction takes on it).

However, conditional statements in C and C++ (e.g. `if`/`else`) are, at the very lowest level (in the hardware), potentially expensive. To understand why, you have to understand how pipelines work: a mispredicted branch can lead to pipeline flushes, decreasing the efficiency of the processor.

One optimization technique the Linux kernel uses for conditional statements is `__builtin_expect`. When working with conditional statements (`if`-`else`), we often know which branch is most probable. If the compiler knows this information in advance, it can generate better-optimized code.

#define likely(x)      __builtin_expect(!!(x), 1)
#define unlikely(x)    __builtin_expect(!!(x), 0)

if (likely(x > 0)) {
    y = value1;
} else {
    y = value2;
}

For the above example, I have marked the `if` condition as `likely()` true, so the compiler will put the true-path code immediately after the branch, and the false-path code behind the branch instruction. In this way the compiler can achieve optimization. But don't use the `likely()` and `unlikely()` macros blindly: if the prediction is correct, the jump instruction costs zero cycles, but if the prediction is wrong, it takes several cycles, because the processor needs to flush its pipeline, which is worse than no prediction at all.

abhiarora
  • 9,743
  • 5
  • 32
  • 57
  • I somewhat disagree. On systems like modern x86 with modern compilers, an `if` with a simple assignment will *not* lead to a pipeline flush in case of a wrong prediction; why should it? There's no complex state to be discarded if the branch prediction failed, just a different value to load into a register. However, your advice on helping the compiler help the branch prediction is very valid. – Marcus Müller Oct 23 '16 at 16:02