-1

I'm working on program which has to repetitively compute:

uint64_t funct()
{
   if(x & 1)
    {
       x=(x * a + b) >> 1;
    }
    else
    {
        x=(x >> 1) * c + d;
    }
    return x;
}

But it is about 2 times slower than just:

uint64_t funct()
{
    x=(x * a + b) >> 1;
    return x;
}

a,b,c,d,x are some 64-bit numbers and x is randomly odd or even. In general I wouldn't expect this, because the main operation in both cases is multiplication with addition. I don't understand why simple condition slows it down twice.

Can it be done somehow faster?

Jason Aller
  • 3,541
  • 28
  • 38
  • 38
Tom
  • 129
  • 7
  • may x=(x >> 1) * c + d; much slow than x=(x * a + b) >> 1; because value of c,x( is big) – long.kl Mar 15 '22 at 05:02
  • Those functions are different. – 273K Mar 15 '22 at 05:12
  • @273K they are different, but I expected that they will perform similarly fast, because they both compute more or less the same, only one does it with the condition (but after condition it compute almost the same formula, whether it is satisfied or not). Putting a condition there slows it down as if we computed x = (x * a + b) >> 1 twice, with no condition. Condition is similarly slow as computing x=(x * a + b) >> 1 on quite big numbers. This is a paradoxical result for me. – Tom Mar 15 '22 at 05:39
  • 2
    Making a test is not free. The processor has to wait the result of the comparison before deciding what to do... note that the architectures of processors are now quite complex, and it is very difficult to evaluate the cost of an instruction without benchmarking. – Damien Mar 15 '22 at 05:47
  • @Damien maybe I measured it wrong. I just run program (and this is the main component of the program) in windows console and there is a execution time. – Tom Mar 15 '22 at 05:51
  • What I called a test is the `if(.)`. There are better ways to measure the execution time, but if you measure a large difference like that... Pay attention to the compilation flags -O2 for example. – Damien Mar 15 '22 at 05:55
  • @Damien of coruse, I wrote about that "it is very difficult to evaluate the cost of an instruction without benchmarking". That's why I wrote my testing may not be objective. – Tom Mar 15 '22 at 06:00
  • By the way I made comparison in Python and it make almost no difference if I put there condition or not. Of course Python is much slower and it is different language. – Tom Mar 15 '22 at 06:06
  • It's unclear what you want. You're asking whether something can be faster based on the existence of an unrelated thing being faster. Well, doing nothing takes no time, so that's faster. – Passer By Mar 15 '22 at 06:38
  • @PasserBy I want to compute this first function faster if it is even possible and know is it normal that if condition is so slow as multiplying 64-bit numbers. – Tom Mar 15 '22 at 07:59
  • Ok. I know why it is so slow. It is not because if condition, but because in if condition is big number to check - x - whcch is 64-bit. If x would be smaller it is fast again. So solution is store some smaller number equal to x (mod 2) and put it in a condition. Then this two functions would achieve comparable performance. – Tom Mar 15 '22 at 08:13
  • Try branchless - extra brackets for readability - `x= ( (x & 1) * ((x * a + b) >> 1 ) + ( (~x & 1) * ((x >> 1) * c + d) );` – Richard Critten Mar 15 '22 at 08:39
  • @RichardCritten It did not help. There is also no way to somehow precompute x & 1, because it still have to be computerd every time. Using if (x >> 63) condition is faster (and fortunately it would not be a problem in my project), and I think I'll change that condition to improve the speed a little. – Tom Mar 15 '22 at 08:58

1 Answers1

1

Are you compiling with max optimization flags? The compiler in some cases may optimize that condition.

In general I wouldn't expect this, because the main operation in both cases is multiplication with addition. I don't understand why simple condition slows it down twice.

A simple condition requires branching. It should take no time at all, unless if it's hot code. Branching too much in hot code can significantly slow down your program. Check this question and the answers, especially the benchmarking part.

As said in that answer, if you really need to optimize out the branching you will need to use some bit twiddling.

Offtkp
  • 366
  • 2
  • 9
  • I turned on O1, O2, O3 and it didn't change anything. Branching is definitely hot code in my program. But it still make no difference in Python. Maybe this is becuse multiplying in Python is much slower. – Tom Mar 15 '22 at 18:07