I recently heard of the idea of branchless programming and I want to give it a try and see if it can boost performance. I have the following C function.
int square(int num) {
    int result = 0;
    if (num > 10) {
        result += num;
    }
    return result * result;
}
After removing the if branch, I have this:
int square(int num) {
    int result = 0;
    int tmp = num > 10;
    result = result * tmp + num * tmp + result * !tmp;
    return result * result;
}
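Before timing anything, it seems worth confirming that the two versions really return the same values. The following is only a minimal sketch of such a check: the names square_branch, square_branchless and check_equivalence are made up here for illustration, and the input range is an arbitrary choice kept small enough that num * num cannot overflow.

```c
#include <assert.h>

/* Original version with the if branch. */
int square_branch(int num) {
    int result = 0;
    if (num > 10) {
        result += num;
    }
    return result * result;
}

/* Branchless rewrite: tmp is 1 when num > 10, otherwise 0. */
int square_branchless(int num) {
    int result = 0;
    int tmp = num > 10;
    result = result * tmp + num * tmp + result * !tmp;
    return result * result;
}

/* Hypothetical helper: aborts via assert if the two versions
   disagree on any input in the inclusive range [lo, hi]. */
void check_equivalence(int lo, int hi) {
    for (int n = lo; n <= hi; n++) {
        assert(square_branch(n) == square_branchless(n));
    }
}
```

Calling check_equivalence(-1000, 1000) from main should return without tripping the assertion if the rewrite is behavior-preserving on that range.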
Now I want to know whether the branchless version is faster. I searched around and found a tool called hyperfine (https://github.com/sharkdp/hyperfine), so I wrote the following main function and tested the two versions of the square function with hyperfine.
#include <stdio.h>

int main(void) {
    printf("%d\n", square(38));
    return 0;
}
The problem is that, based on the hyperfine results, I can't determine which version is better. In C programming, how do people usually determine which version of a function is faster?
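My suspicion is that hyperfine is timing the whole process (startup, printf, exit), which probably dwarfs a single call to square. One alternative I can imagine is timing many calls inside the process. The following is only a rough sketch under that assumption: time_calls and fill_inputs are hypothetical helpers, clock() is the standard C timer, and the inputs are pseudo-random values from 0 to 20 so that the num > 10 test goes both ways.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

int square_branch(int num) {
    int result = 0;
    if (num > 10) {
        result += num;
    }
    return result * result;
}

int square_branchless(int num) {
    int tmp = num > 10;       /* 1 when num > 10, otherwise 0 */
    int result = num * tmp;   /* simplified from the version above (result starts at 0) */
    return result * result;
}

/* Hypothetical helper: fill inputs with pseudo-random values in 0..20. */
void fill_inputs(int *inputs, size_t n, unsigned seed) {
    srand(seed);
    for (size_t i = 0; i < n; i++) {
        inputs[i] = rand() % 21;
    }
}

/* Hypothetical helper: call fn on every input, `repeats` times over,
   and return the elapsed time in seconds as reported by clock().
   Calling through a function pointer may prevent inlining, which
   itself affects the measurement. */
double time_calls(int (*fn)(int), const int *inputs, size_t n, int repeats) {
    assert(inputs != NULL && n > 0);
    volatile unsigned sink = 0;  /* keeps the compiler from discarding the calls */
    clock_t start = clock();
    for (int r = 0; r < repeats; r++) {
        for (size_t i = 0; i < n; i++) {
            sink += (unsigned)fn(inputs[i]);
        }
    }
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

The idea would be to fill one array, pass it to time_calls once with square_branch and once with square_branchless, and compare the two returned times over several runs. Even then, an optimizing compiler may generate the same machine code for both versions, so the numbers would need to be read with care.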
Below are some of my hyperfine results.
C:\my_projects\untitled>hyperfine branchless.exe
Benchmark #1: branchless.exe
Time (mean ± σ): 5.4 ms ± 0.2 ms [User: 2.2 ms, System: 3.2 ms]
Range (min … max): 4.9 ms … 6.1 ms 230 runs
C:\my_projects\untitled>hyperfine branch.exe
Benchmark #1: branch.exe
Time (mean ± σ): 6.1 ms ± 0.7 ms [User: 2.2 ms, System: 3.7 ms]
Range (min … max): 5.0 ms … 9.7 ms 225 runs
C:\my_projects\untitled>hyperfine branch.exe
Benchmark #1: branch.exe
Time (mean ± σ): 5.5 ms ± 0.3 ms [User: 2.1 ms, System: 3.5 ms]
Range (min … max): 4.9 ms … 7.0 ms 211 runs
C:\my_projects\untitled>hyperfine branch.exe
Benchmark #1: branch.exe
Time (mean ± σ): 5.6 ms ± 0.4 ms [User: 2.0 ms, System: 3.9 ms]
Range (min … max): 4.8 ms … 7.0 ms 217 runs
Warning: Command took less than 5 ms to complete. Results might be inaccurate.
C:\my_projects\untitled>hyperfine branch.exe
Benchmark #1: branch.exe
Time (mean ± σ): 5.7 ms ± 0.3 ms [User: 1.9 ms, System: 4.0 ms]
Range (min … max): 5.0 ms … 6.6 ms 220 runs
C:\my_projects\untitled>hyperfine branchless.exe
Benchmark #1: branchless.exe
Time (mean ± σ): 5.6 ms ± 0.3 ms [User: 1.9 ms, System: 3.9 ms]
Range (min … max): 4.8 ms … 6.9 ms 219 runs
C:\my_projects\untitled>hyperfine branchless.exe
Benchmark #1: branchless.exe
Time (mean ± σ): 5.8 ms ± 0.3 ms [User: 1.5 ms, System: 4.0 ms]
Range (min … max): 5.2 ms … 7.3 ms 224 runs
C:\my_projects\untitled>