
What is the fastest way to limit a number to at most 64 (i.e. <= 64)? This is what I have now:

len = len > 64 ? 64 : len; /* limit to 64 */

Thanks!

yo3hcv
  • "Which" suggests you have one or more other ways in mind. What do you want to compare against? – Fred Larson May 23 '18 at 16:05
  • @FredLarson probably against the way included in the question. – Christian Gibbons May 23 '18 at 16:05
  • Modulo 65 will be expensive; modulo 64 will be cheap (equivalent to `& 63`). That gives you numbers in [0, 63], but it won't give you 63 (or 64) for inputs >= 64: it wraps around back to 0. If you're OK with that, it might be very slightly faster than the if, but I'd expect the difference to be small. At least on my machine, ifs aren't that expensive. – Petr Skocik May 23 '18 at 16:22
  • This probably compiles into three instructions: one load, one compare, and one conditional move. (Check the assembly for yourself.) You are going to have trouble beating that. If you are doing this sort of thing on many values, vectorizing the code, whether by hand or automatically by the compiler, will make a much bigger difference than any micro-optimization. – Nemo May 23 '18 at 16:30
  • What problem are you having with this? –  May 23 '18 at 16:37
  • If your target CPU has a `min` instruction, this will generate one instruction: `min.u r0, #64`. This is available e.g. in x86 MMX/SSE. – Aki Suihkonen May 23 '18 at 16:44
  • What type is 'len'? – Martin James May 23 '18 at 17:04

2 Answers


Don't bother. The compiler optimizes better than you could.

You might perhaps try

len = ((len - 1) & 0x3f) + 1;

(but this wraps rather than clamps: a len of 65 becomes 1, and a len of 0 becomes 64, so it only matches the original for values from 1 to 64)
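To make the difference concrete, here is a minimal sketch that puts the two expressions side by side. The function names `clamp64` and `mask64` are made up for illustration, and `len` is assumed to be `unsigned` (the question does not say what type it is).

```c
/* Behaviour from the question: anything above 64 becomes 64. */
unsigned clamp64(unsigned len) {
    return len > 64 ? 64 : len;
}

/* The mask trick: wraps into the range 1..64 instead of clamping. */
unsigned mask64(unsigned len) {
    /* For len == 0, len - 1 wraps to UINT_MAX, so the result is 64. */
    return ((len - 1) & 0x3f) + 1;
}
```

The two agree for every `len` from 1 to 64. Outside that range they diverge: `mask64(65)` is 1 where `clamp64(65)` is 64, and `mask64(0)` is 64 where `clamp64(0)` is 0.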

If this really matters to you, benchmark!

Basile Starynkevitch
  • Your code doesn't work for `len` of 65 (among an infinity of others). Your first sentence is better. – Nemo May 23 '18 at 16:25

I wrote a small test program:

#include <stdio.h>

int main(void) {
    unsigned int len;
    scanf("%u", &len);
    len = len > 64 ? 64 : len;
    printf("%u\n", len);
}

and compiled it with gcc -O3, which generated this assembly:

cmpl    $64, 4(%rsp)
movl    $64, %edx
leaq    .LC1(%rip), %rsi
cmovbe  4(%rsp), %edx

The leaq in the middle loads the address of the "%u\n" format string; I presume the compiler placed it there for instruction scheduling. The generated code looks pretty efficient: there are no jumps, just a compare and a conditional move (cmovbe, the unsigned "below or equal" form), so there is no branch to mispredict.

So the best way to optimize your executable is to get a good compiler.
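If you want to repeat this check with your own compiler, it can help to isolate the clamp in a small function and read the output of `gcc -O3 -S`. The sketch below applies the clamp to a whole array, which is the case where, as noted in the comments, vectorization matters more than any micro-optimization. The name `clamp_all_64` is hypothetical, and whether the loop actually gets vectorized depends on your compiler, flags, and target.

```c
#include <stddef.h>

/* Hypothetical helper: clamp every element of an array to at most 64. */
void clamp_all_64(unsigned *values, size_t count) {
    for (size_t i = 0; i < count; i++) {
        /* Same ternary as in the question, applied per element. */
        values[i] = values[i] > 64 ? 64 : values[i];
    }
}
```

Keeping the plain ternary leaves the choice between a conditional move, a min instruction, or a vectorized loop to the compiler.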