2

TLDR: I want to give runtime branch prediction hits for x86-64, ideally if compiled by MSVC without asm, for a branch that is based on random data, by peeking into that data. Is it possible?


Assume sequentially interpreting a byte stream, where variable-sized StructureA and StructureB occur, distinguished by some bit patterns, and they occur randomly with approximately equal probability.

They have different functions to interpret them, so there's a branch dependent on data.

As CPU will not be able to predict a random pattern, mispredictions are expected to delay execution.

I see at least two ways I could provide information for branching in advance:

  1. By peeking forward into the stream
  2. By capturing bits, then by starting processing independently both StructureA and StructureB, interleaving code, hoping that out of order superscalar execution make the lines of code where I process StructureA and StructureB executing simultaneously, then branching, and discarding results for the wrong structure.

I know there are (exotic) architectures which always employ delayed branching instead of branch prediction, so at least the second option would have worked.

But is there a way to give such branch prediction hint on usual x86-64 ?

I'm looking for machine instruction or something like this.

Although if such thing exist, I ideally want it to compile as C++ code. MSVC, x64, Windows 10, if these details matter.


As it is apparently not possible, do following workarounds make sense:

  1. Always alternate branches, when there are consecutive StructureA, still process fake StructureB, so that the pattern is predictable.
  2. Always take both branches, discard results of wrong branch by cmov
  3. Make some static hint with some sort of [[likely]] towards longer branch, assuming they are not equally fast.

By make sense I mean whether they can they benefit, i.e. produce better throughput than control flow with unpredictable branches?

Alex Guteniev
  • 12,039
  • 2
  • 34
  • 79
  • There are no hardware branch hints; the best you can do is software pipelining so the branch condition is ready in plenty of time. Or are you suggesting something more sophisticated, like software speculation, where you change your branch conditions to be more predictable, and sometimes branchlessly throw away wasted work? That could be a win in specific cases, and not an exact duplicate of [Avoid stalling pipeline by calculating conditional early](https://stackoverflow.com/q/49932119) – Peter Cordes May 21 '20 at 17:48
  • I've edited my question. I was considering workarounds towards making the pattern predictable, but assumed that they will not benefit over unpredictable branch. – Alex Guteniev May 21 '20 at 17:56
  • 2
    Branchless (doing both sides and then `cmov`) depends very much on how much work there is in each side, vs. predictability. Two cheap things + `cmov` is often cheaper than a branch + one cheap thing. Two slow things is often worth branching for. – Peter Cordes May 21 '20 at 17:58

0 Answers0