Fast add/subtract two integers based on sign of third integer

Question

I am looking for a fast (branch-less) code to compute the following:

if ( k > 0 )
    x += i;
else if ( k < 0 )
    x -= i;
else if ( k == 0 ) 
    // do nothing.

where k, x, and i are of int type.

It is fine for the solution to use intrinsics.

You've busted the compiler generating inefficient code for this? — user4581301, Nov 03 '20 at 22:42
user2052436, Approaches using `x += signof(k)*i;` suffer from potential UB when `i == INT_MIN, k < 0`. Do you care about this corner case? — chux - Reinstate Monica, Nov 03 '20 at 23:24
The question and all the currently posted answers are all wild speculation. Discussing this without showing the generated machine code for some CPU is quite pointless. — Lundin, Nov 04 '20 at 09:30
@Lundin the nature of a micro-optimization question like this is that there's no single best answer. The only thing you can do is collect a number of different approaches and benchmark them in your own final configuration. There's already an assumption built into the question that branchless will be faster, but that's not a given either. — Mark Ransom, Nov 04 '20 at 16:15

cigien · Answer 1 · 2020-11-03T22:57:19.530

4

This answer shows a fast, branchless way of computing the sign of a value. Assuming that's correct, you can write:

x += i * ((0 < k) - (k < 0))

edited Nov 03 '20 at 22:57

answered Nov 03 '20 at 22:53

cigien

57,834
11
73
112

2

This very much depends on the compiler having a branchless way to evaluate `0 < k` and `k < 0`. – Mark Ransom Nov 03 '20 at 22:54
@MarkRansom Couldn't say for sure. The linked answer claims it's branchless, and I'm just building upon that. Is the phrasing better now? – cigien Nov 03 '20 at 22:55

dbush · Answer 2 · 2020-11-03T22:55:34.873

2

Try this:

int sign = !!k - !!(k & INT_MIN) * 2;
x += i * sign;

This assumes two's complement representation of integers, resulting in INT_MIN having only the high-order bit set.

edited Nov 03 '20 at 22:55

answered Nov 03 '20 at 22:50

dbush

205,898
23
218
273

@MarkRansom Added assumptions. – dbush Nov 03 '20 at 22:55
1

Worth noting that since C++20 this assumption is guaranteed. – eerorika Nov 03 '20 at 22:58
@eerorika did C++20 mandate two's complement integer representation? That's quite a leap. In the past they've always tried to be accommodating to weird architectures. – Mark Ransom Nov 03 '20 at 23:12
@MarkRansom They did. I don't know if there are C++20 implementations for other archs but they will have to emulate two's complement if they want to conform. – eerorika Nov 03 '20 at 23:15
Why not replace `INT_MIN` with `~INT_MAX` and forget about whether or not it's two's complement? – Christian Gibbons Nov 03 '20 at 23:20

score 1 · Answer 3 · answered Nov 03 '20 at 22:44

Just compile your code with good optimisation settings, get a disassembly of your code, and look at the result. Then change your code until the disassembly shows what you want. There's a chance that you already get branchless code. There's a better chance if you write

if ( k > 0 )
    x += i;
if ( k < 0 )
    x -= i;

because the else in between makes it harder for the compiler to avoid branches.

score 1 · Answer 4 · edited Nov 04 '20 at 10:43

1

x = x + i*(k>0) - i*(k<0);

But, honestly, given the many optimization possibilities of modern compilers, I wouldn't trade clarity of code for aleggedly faster code. For the x86 platform, for example, MMX and SSE units can do this operation with no branches.

edited Nov 04 '20 at 10:43

cigien

57,834
11
73
112

answered Nov 03 '20 at 22:54

mcleod_ideafix

11,128
2
24
32

score 0 · Answer 5 · answered Nov 03 '20 at 22:53

0

You seem to be expecting code that is "branchless" to be measurably and importantly "faster?" With maximum optimization settings turned on, exactly what instruction sequence does your compiler now produce? Have you then profiled that code to confirm that it is not fast enough?

answered Nov 03 '20 at 22:53

Mike Robinson

8,490
5
28
41

2

While it's a fair question, I feel like this is more of a comment than an answer. – cigien Nov 03 '20 at 23:06

score 0 · Answer 6 · edited Nov 03 '20 at 23:20

Try looking at the assembly language for this:

if (k != 0)
{
    const int multiplier = (k > 0) ? 1 : -1;
    x += (multiplier) * i;
}

Some processors may have conditional instructions that are only executed depending on the condition code (like ARM).

Look at the assembly code printed by your compiler.

Edit 1: performance concerns
A lot of modern processors have enough space in their instruction cache to that they can handle the above code efficiently. The comparison should be quick for the processor's branch decision making circuitry.

Profile or Benchmark before making any judgements about code having to be one line.

Remy Lebeau · Answer 7 · 2020-11-03T23:24:22.973

0

Just for kicks, I present this:

using opType = void(*)(int&, int);
const opType ops[3] =
{
    [](int&  , int  ){ },
    [](int& x, int i){ x += i; },
    [](int& x, int i){ x -= i; }
};

auto f = ops[int(k > 0) + (2 * int(k < 0))];
f(x, i);

Live Demo

Obviously, the other solutions using solely mathematics are going to be more efficient :) But this give you some options if you want to do more complex operations based on what k is.

edited Nov 03 '20 at 23:24

answered Nov 03 '20 at 23:06

Remy Lebeau

555,201
31
458
770

1

Now I'm itching to see the assembly for this :) – cigien Nov 03 '20 at 23:09
Oh, I wasn't actually asking for the link, but thanks :) Hmm, even at O3 there appear to be jumps that don't get optimized out. Not that I'm any good at reading assembly. – cigien Nov 03 '20 at 23:21

Hans Olsson · Answer 8 · 2020-11-04T15:26:56.377

0

Except for INT_MIN the following should work in C++11

x += (std::signbit(-k)-std::signbit(k))*i;

And the following should work in all cases, including INT_MIN - if long is longer than int:

x += (std::signbit(-(long)k)-std::signbit(k))*i;

Added: Internally std::signbit is in many compilers converted to a simple shift-operation, which should be branch-less and fast.

edited Nov 04 '20 at 15:26

answered Nov 04 '20 at 15:03

Hans Olsson

11,123
15
38

score -2 · Answer 9 · answered Nov 03 '20 at 22:54

-2

This is a branchless solution, but the time will vary based on your environment. It shouldn't take too much more than 0.005 seconds, so I think that won't be a problem.

k>0?x+=i:k<0?x-=i:0

answered Nov 03 '20 at 22:54

Lakshya Raj

1,669
3
10
34

1

Ternary operators are like branches or `if` statements. – Thomas Matthews Nov 03 '20 at 22:57

Fast add/subtract two integers based on sign of third integer

9 Answers9