13

How can I demonstrate for students the usability of likely and unlikely compiler hints (__builtin_expect)?

Can you write an sample code, which will be several times faster with these hints comparing the code without hints.

jww
  • 97,681
  • 90
  • 411
  • 885
osgx
  • 90,338
  • 53
  • 357
  • 513
  • 2
    http://kerneltrap.org/node/4705 – T.J. Crowder Apr 29 '10 at 16:11
  • @T.J. Crowder, yes. but I want a program sample in which students can FEEL the difference, not to read it in assembler. – osgx Apr 29 '10 at 16:35
  • 1
    @osgx: You won't get such a sample, because there is no difference to be felt. This is the worst, ugliest kind of useless micro-optimization. – R.. GitHub STOP HELPING ICE Feb 06 '11 at 08:14
  • @R.., but Linux kernel actively using It. Also, the programmer knows a lot about his program and can mark rare used code better than compiler can detect this. – osgx Feb 07 '11 at 00:08
  • Linux kernel is actively writing all sorts of things that are horrible C. I agree the programmer can better know which branch is more likely, but that doesn't mean I support ugly premature micro-optimization all over the place. It's really rare that these ugly `likely` macros produce any perceivable performance benefit. – R.. GitHub STOP HELPING ICE Feb 07 '11 at 00:27
  • 1
    @R.., If you think that likely/unlikely can't give any benefit, you cat open a new Question. But this question was opened upon request from my friend. He is a CS teacher of first-year students and he wanted to show them a likely/unlikely example. He wasn't able to produce a example, so I asked this q. I don't know is this example required by course or he adds it by himself. But the goal of this is to give basic understanding of likely to students. – osgx Feb 07 '11 at 00:34
  • 3
    What kind of first-year CS instructor teaches students premature micro-optimization? He should be teaching them the real language first, and higher-level efficiency considerations (such as working with data in-place rather than allocating copies, efficient algorithms with respect to time and space, etc.). If `likely` and `unlikely` are ever taught in a CS curriculum, it should be a senior year topic. – R.. GitHub STOP HELPING ICE Feb 07 '11 at 04:13
  • Its russia. First year of the high school. What do you mean as "senior year"? – osgx Feb 07 '11 at 16:16
  • This teacher is the coach of ACM team, which passed to half-final or even to final. So, I think, he does teach a lot of high-level optimisation. Also, this unlikely/likely was not the first lesson, but it was only a part of lesson in middle of the year. – osgx Feb 07 '11 at 16:19

2 Answers2

25

Here is the one I use, a really inefficient implementation of the Fibonacci numbers:

#include <stdio.h>
#include <inttypes.h>
#include <time.h>
#include <assert.h>

#define likely(x) __builtin_expect((x),1)
#define unlikely(x) __builtin_expect((x),0)

uint64_t fib(uint64_t n)
{
    if (opt(n == 0 || n == 1)) {
        return n;
    } else {
        return fib(n - 2) + fib(n - 1);
    }
}

int main(int argc, char **argv)
{
    int i, max = 45;
    clock_t tm;

    if (argc == 2) {
        max = atoi(argv[1]);
        assert(max > 0);
    } else {
        assert(argc == 1);
    }

    tm = -clock();
    for (i = 0; i <= max; ++i)
        printf("fib(%d) = %" PRIu64 "\n", i, fib(i));
    tm += clock();

    printf("Time elapsed: %.3fs\n", (double)tm / CLOCKS_PER_SEC);
    return 0;
}

To demonstrate, using GCC:

~% gcc -O2 -Dopt= -o test-nrm test.c
~% ./test-nrm
...
fib(45) = 1134903170
Time elapsed: 34.290s

~% gcc -O2 -Dopt=unlikely -o test-opt test.c
~% ./test-opt
...
fib(45) = 1134903170
Time elapsed: 33.530s

A few hundred milliseconds less. This gain is due to the programmer-aided branch prediction.

But now, for what the programmer should really be doing instead:

~% gcc -O2 -Dopt= -fprofile-generate -o test.prof test.c
~% ./test.prof 
...
fib(45) = 1134903170
Time elapsed: 77.530s  /this run is slowed down by profile generation.

~% gcc -O2 -Dopt= -fprofile-use -o test.good test.c
~% ./test.good
fib(45) = 1134903170
Time elapsed: 17.760s

With compiler-aided runtime profiling, we managed to reduce from the original 34.290s to 17.760s. Much better than with programmer-aided branch prediction!

osgx
  • 90,338
  • 53
  • 357
  • 513
Juliano
  • 39,173
  • 13
  • 67
  • 73
  • 1
    profile use is the good option, but I need to demonstrate `likely` and `unlikely` – osgx Apr 29 '10 at 17:14
  • 21
    aah, tm = -clock(); tm += clock(); is beautiful, I haven't seen it before, thank you! – MK. Apr 29 '10 at 18:00
  • I think what this demonstrates is that `likely` and `unlikely` are not very useful. Also perhaps that this is a really bad implementation of `fib()`... – R.. GitHub STOP HELPING ICE Feb 07 '11 at 00:29
  • R.., This is only a studying example, and it show that sometimes likely can help a bit. – osgx Feb 07 '11 at 00:35
  • 9
    Where are `likely` and `unlikely` actually used in the above code? – jon hanson Feb 22 '11 at 08:45
  • @joh hanson: you pass it as a compiler define in -Dopt= – Juliano Feb 22 '11 at 13:16
  • On my laptop: -Dopt= ==> 44.284s, -Dopt=unlikely ==> 43.833s | -Dopt= -fprofile-generate ==> 27.563s, -Dopt= -fprofile-use ==> 21.878s | -Dopt=unlikely -fprofile-generate ==> 27.245s, -Dopt=unlikely -fprofile-use ==> 21.863s (gcc version 4.8.4) I guess the benefit is there anyway, if you are writing a linux scheduler... – kavadias Feb 05 '16 at 14:48
-2

From this blog post. I think likely and unlikely are mostly obsolete. Very cheap CPUs (ARM Cortex A20 in the example) have branch predictors and there is no penalty regardless of jump is taken / jump is not taken. When you introduce likely/unlikely the results will be either the same or worse (because compiler has generated more instructions).

Bogi
  • 2,274
  • 5
  • 26
  • 34