TLDR: I want to give runtime branch prediction hints on x86-64, ideally from code compiled by MSVC without asm, for a branch that depends on random data, by peeking into that data. Is it possible?
Assume sequentially interpreting a byte stream, where variable-sized `StructureA` and `StructureB` occur, distinguished by some bit patterns, and they occur randomly with approximately equal probability.
They are interpreted by different functions, so there is a data-dependent branch.
As the CPU cannot predict a random pattern, mispredictions are expected to delay execution.
I see at least two ways I could provide information for branching in advance:
- By peeking forward into the stream
- By capturing the distinguishing bits, then starting to process both `StructureA` and `StructureB` independently with interleaved code, hoping that out-of-order superscalar execution runs the `StructureA` and `StructureB` code simultaneously, then branching and discarding the results for the wrong structure.
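To make the second idea concrete, here is a minimal sketch of "decode both, then select", next to the plain branchy loop. The record layout, `Decoded`, `decodeA`, `decodeB`, `sumBranchy` and `sumBranchless` are all hypothetical names invented for illustration, and the sketch assumes the buffer has at least one readable padding byte after the last record so that both decoders can always run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout, assumed for illustration only:
//   byte 0, bit 7: type tag (0 = StructureA, 1 = StructureB)
//   StructureA: 2 bytes total, payload is byte 1
//   StructureB: 3 bytes total, payload is bytes 1..2 (little-endian)
struct Decoded {
    std::uint32_t value;  // interpreted payload
    std::size_t size;     // total record size in bytes
};

inline Decoded decodeA(const std::uint8_t* p) {
    return {p[1], 2};
}

inline Decoded decodeB(const std::uint8_t* p) {
    return {static_cast<std::uint32_t>(p[1]) |
                (static_cast<std::uint32_t>(p[2]) << 8),
            3};
}

// Baseline: one data-dependent branch per record.
std::uint64_t sumBranchy(const std::uint8_t* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::uint64_t sum = 0;
    std::size_t pos = 0;
    while (pos < len) {
        const std::uint8_t* p = data + pos;
        if ((p[0] & 0x80) != 0) {
            const Decoded b = decodeB(p);
            sum += b.value;
            pos += b.size;
        } else {
            const Decoded a = decodeA(p);
            sum += a.value;
            pos += a.size;
        }
    }
    return sum;
}

// Decode both interpretations unconditionally, then select one.
// Assumption: data[len] is readable padding, because decodeB may read
// one byte past a trailing StructureA.
std::uint64_t sumBranchless(const std::uint8_t* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::uint64_t sum = 0;
    std::size_t pos = 0;
    while (pos < len) {
        const std::uint8_t* p = data + pos;
        const Decoded a = decodeA(p);
        const Decoded b = decodeB(p);
        const bool isB = (p[0] & 0x80) != 0;
        // Ternaries on plain integers: the compiler may emit cmov here,
        // but nothing in standard C++ guarantees it.
        sum += isB ? b.value : a.value;
        pos += isB ? b.size : a.size;
    }
    return sum;
}
```

Whether the select actually compiles to `cmov` can only be confirmed by inspecting the generated assembly. Note also that the selected size now sits on the loop-carried dependency chain through `pos`, whereas a correctly predicted branch would let the CPU speculate past it, so the two variants are worth timing on real data rather than assuming one wins.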
I know there are (exotic) architectures which always employ delayed branching instead of branch prediction, so at least the second option would have worked.
But is there a way to give such a branch prediction hint on ordinary x86-64?
I'm looking for a machine instruction or something similar.
If such a thing exists, I would ideally want to use it from C++ code. MSVC, x64, Windows 10, if these details matter.
As it is apparently not possible, do the following workarounds make sense:
- Always alternate branches: when there are consecutive `StructureA` records, still process a fake `StructureB`, so that the pattern is predictable.
- Always take both branches and discard the results of the wrong branch with `cmov`.
- Add a static hint with some sort of `[[likely]]` towards the longer branch, assuming they are not equally fast.
By "make sense" I mean: can they help, i.e. produce better throughput than control flow with unpredictable branches?
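For reference, the static-hint workaround would look roughly like the sketch below. `handleA`, `handleB` and `dispatch` are hypothetical stand-ins for the real interpreters, and the tag test (bit 7 of a tag byte) is an assumed encoding. `[[likely]]` is a C++20 attribute, so with MSVC this needs `/std:c++20` or `/std:c++latest`.

```cpp
#include <cstdint>

// Hypothetical handlers; assume handleB is the longer (slower) path.
std::uint32_t handleA(std::uint8_t payload) { return payload + 1u; }
std::uint32_t handleB(std::uint8_t payload) { return payload * 3u; }

std::uint32_t dispatch(std::uint8_t tag, std::uint8_t payload) {
    // Static hint only: it is fixed at compile time and may influence
    // code layout; it does not tell the CPU anything about the data
    // seen at run time.
    if ((tag & 0x80) != 0) [[likely]] {
        return handleB(payload);
    } else {
        return handleA(payload);
    }
}
```

Since the hint is the same for every record, it cannot make a 50/50 random branch predictable; at best it changes which path is laid out as the fall-through.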