We have a macro for error-checking that goes like this:

#define CheckCondition( x ) \
    if( x ) { \
        /* okay, do nothing */ \
    } else { \
        CallFunctionThatThrowsException(); \
    }

Normally the condition is true, and we'd like the CPU's branch prediction to always select that path. If the condition happens to be false, we don't really care about a misprediction: throwing an exception and the massive stack unwinding that follows will cost a fortune anyway.

According to low-level CPU documentation, static branch prediction treats forward and backward jumps slightly differently (roughly: a backward jump is predicted taken and a forward jump is predicted not taken), so the compiler could improve branch prediction by generating code that gives the right hints to the CPU's branch predictor.

gcc seems to have `likely` and `unlikely` hints for that (built on `__builtin_expect`). Is there anything like that in Visual C++? Can the `__assume` keyword be used for that?
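For reference, here is a minimal sketch of what the gcc-style hint looks like when applied to this macro. It assumes GCC or Clang for `__builtin_expect`; on any other compiler the hypothetical `LIKELY`/`UNLIKELY` macros below degrade to the plain condition and carry no hint at all. `CallFunctionThatThrowsException` is given a stand-in body and `checkedDivide` is an invented caller, both purely for illustration.

```cpp
#include <stdexcept>

// Assumption: GCC or Clang. Elsewhere (e.g. MSVC) these macros expand to
// the bare condition, so the code still compiles but no hint is given.
#if defined(__GNUC__) || defined(__clang__)
#  define LIKELY(x)   (__builtin_expect(!!(x), 1))
#  define UNLIKELY(x) (__builtin_expect(!!(x), 0))
#else
#  define LIKELY(x)   (!!(x))
#  define UNLIKELY(x) (!!(x))
#endif

// Stand-in body for the function named in the question.
[[noreturn]] inline void CallFunctionThatThrowsException() {
    throw std::runtime_error("condition failed");
}

// do { } while (0) makes the macro behave like a single statement.
#define CheckCondition(x)                        \
    do {                                         \
        if (UNLIKELY(!(x))) {                    \
            CallFunctionThatThrowsException();   \
        }                                        \
    } while (0)

// Hypothetical caller: the check is expected to pass almost always.
inline int checkedDivide(int a, int b) {
    CheckCondition(b != 0);
    return a / b;
}
```

Whether the hint changes the generated code depends on the compiler and optimization level, so it is worth comparing the assembly with and without it. C++20 later added the standard `[[likely]]` and `[[unlikely]]` attributes, which serve a similar purpose where the compiler supports them.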

sharptooth
  • Reading the documentation it's clear `__assume` *cannot* be used, because it makes the compiler skip the other branch altogether. – Jan Hudec Feb 22 '11 at 07:23
  • Sounds like a micro optimization – BЈовић Feb 22 '11 at 07:42
  • @VJo: Well, kind of. But the intention is to change one macro and have an impact on all code that uses it. If this gives us even the slightest gain in execution speed it's not that bad. – sharptooth Feb 22 '11 at 07:46
  • @VJo Micro optimizations can be important if you're trying to save microseconds. – Crashworks Feb 22 '11 at 08:06
  • @BЈовић this does sound like a micro-optimization, but this sort of micro-optimization makes sense in many places. Consider, for example, code that performs real-time processing, or kernel code in hot paths (e.g. the I/O path). While the gcc hints will tweak the generated code, there's no way to pass hints to the executing CPU. NetBurst era P4 CPUs allowed and used hints; modern processors allow them but ignore them. The best bet is to instrument your code using profile-guided optimization and use the instrumentation data to hint the compiler on hot/cold paths. – Nik Bougalis Nov 17 '12 at 04:56
  • Just an FYI: here's an interesting article with some analysis of the effect of `__builtin_expect` in GCC: http://blog.man7.org/2012/10/how-much-do-builtinexpect-likely-and.html – Michael Burr Dec 01 '12 at 22:44
  • @NikBougalis: Can you point to docs that describe how current hints "are allowed but ignored"? I have legacy code full of these, and often wonder why I'm keeping them up to date. – Ira Baxter Dec 02 '12 at 00:40
  • @IraBaxter `__builtin_expect` has two effects: one is directly on the generated code; the other is indirect, as the new code is structured so as to help the branch predictor. However, the newer generations of Intel processors have much improved branch prediction and (supposedly) consider hints such as "take the if branch" only as primers. I can't quote you specific docs about how the predictors operate, since neither Intel nor AMD release details about their predictors and their operation, which should come as no surprise. – Nik Bougalis Dec 02 '12 at 01:04

2 Answers

Not in MSVC, unfortunately, according to their developer center.

It's very frustrating because we'd like to use it in a couple of cases where the equivalent GCC intrinsic has saved us a critical few microseconds in inner loops, but the closest we can get is to swap the if and else clauses so that the more likely case is in the forward-jump-not-taken branch.
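As a rough illustration of that workaround, the sketch below writes the expected common case as the `if` body and the rare case as the `else`. The function `sumClamped` and its data are hypothetical, and nothing here guarantees a particular code layout: the compiler is free to reorder the blocks or emit branchless code, so the generated assembly should be inspected before relying on it.

```cpp
#include <vector>

// Hypothetical example: most samples are assumed to be within the limit,
// so that case is written first and the rare clamping case goes last.
inline long long sumClamped(const std::vector<int>& samples, int limit) {
    long long total = 0;
    for (int s : samples) {
        if (s <= limit) {   // expected common case, written first
            total += s;
        } else {            // expected rare case
            total += limit;
        }
    }
    return total;
}
```

For example, `sumClamped({1, 2, 10}, 5)` adds 1, 2, and the clamped value 5.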

Crashworks
  • How can I achieve that swapping if one branch is empty? The compiler seems to emit exactly the same code for `if( !condition ) { action(); } else {}` and `if( condition ) {} else { action(); }`. – sharptooth Feb 22 '11 at 08:12
  • @sharptooth We've never had to do it for that -- our cases all had code in both blocks. I'm afraid this is just Yet Another MSVC Shortcoming. – Crashworks Feb 22 '11 at 08:14
  • So there is an implicit hint in the ordering of the conditions. So you effectively have the same ability. – Martin York Feb 22 '11 at 08:20
  • @Martin York: I couldn't see any change in the order of the `if-else` branches if one branch is empty. – sharptooth Feb 22 '11 at 09:35
  • @sharptooth If one of the branches is empty there is only one branch, right? The optimizer isn't stupid! – Bo Persson Feb 22 '11 at 18:43
  • @Bo Persson: Yes, it's even too smart - I seem to have no means to tell it that this only branch is unlikely to be taken. – sharptooth Feb 24 '11 at 06:17
  • @sharptooth: you shouldn't worry too much about this. Modern CPUs use past history to predict future behavior of branches, so after the first few runs the CPU will begin consistently predicting the "all ok" path. You can *help* improve this by actually wrapping the error checking inside a function, instead of sprinkling `if() { } else { }` blocks throughout the code, because the CPU tracks branching by address and if you have a million different such checks, at a million different places (i.e. addresses) you're putting pressure on the limited resources of the branch predictor. – Nik Bougalis Nov 17 '12 at 05:03
  • @NikBougalis That assumes that the CPU's branch history is large enough to remember this particular instruction every time it is called. If this error-checking macro occurs in many different places in the code, the CPU will not be able to store all of them in its prediction table. – Crashworks Aug 29 '13 at 21:47
  • @Crashworks Certainly true, however modern CPUs have not only excellent branch prediction but large predictors, and very good "static" algorithms as well (for when a branch isn't found in the history) which compilers take advantage of. Couple that with the profile-guided optimization features available on just about every major compiler, and worrying about micro-optimizing branch prediction just isn't sensible, if for no *other* reason than that those micro-optimizations could horribly break with a future processor. The fact is that almost always there are huge optimization gains to be had elsewhere. – Nik Bougalis Aug 29 '13 at 23:21

Enable Profile-Guided Optimization. The compiler will not only lay out code to favor branch prediction, but may also move the cold code out of the way entirely. This Channel 9 video explains the various optimizations.

Bruno Martinez
  • PGO requires a separate pass and also I'll have to cover all those statements in my test run to have them all PGOed. – sharptooth Dec 03 '12 at 06:38
  • It's more work to set up, but it also generates better code. Microsoft won't implement the likely hint, as PGO dominates it. Note that you don't want to use your tests to train PGO. Tests cover border cases. Train PGO on your common cases. – Bruno Martinez Dec 03 '12 at 21:15
  • Nope, PGO does not dominate the `likely` hint - they are just different things. – sharptooth Dec 04 '12 at 07:07