1

I came across a post on branch prediction (Why is it faster to process a sorted array than an unsorted array?) and it got me thinking about my own Monte Carlo simulations. For the sake of specificity, let's say we are working in C and writing for CPUs. (Does that matter?) Suppose you have a loop that goes something like this:

while ( condition ) {

    // dE will be stochastic...
    dE = MonteCarloMove();

    // ...and d is a random number between 0 and 1
    d = RandomRealNumber(0,1);

    w = exp(-dE);
    if ( dE < 0 || d < w ) {
        AcceptMonteCarloMove();
    }
    else {
        RejectMonteCarloMove();
    }
}

It seems to me that branch prediction at the if..else statement is completely useless, since it is random which way we go. In a properly designed Monte Carlo simulation, you'll have roughly as many accepts as rejections. So here are my questions:

  • Is there any technique one could use to avoid the branching slowdown in a random process like this?
  • If so, is it likely to provide any significant speed-up, considering that MonteCarloMove() probably involves some expensive numerical calculations?

Update: In case this general question is unanswerable, or the answer is generally no, suppose we can make some assumptions about the functions inside the if statement:

  • AcceptMonteCarloMove() is a function that updates the current state of the simulation. All it likely needs to do is
    • Copy values from one variable to another;
    • Perform simple arithmetic (addition/subtraction).
  • RejectMonteCarloMove() needs to reset the current state and possibly some bookkeeping variables. Likely it only needs to
    • Copy values from one variable to another.
Community
  • 1
  • 1
Marco Tompitak
  • 648
  • 1
  • 5
  • 12
  • 2
    You added three very different languages. Pick one. Especially for Java things can be very different here. – too honest for this site Nov 03 '15 at 12:46
  • That's one thing I wanted to know. I restricted the question to C. – Marco Tompitak Nov 03 '15 at 12:47
  • The answer strongly depends on what do `AcceptMontecarloMove` and `RejectMonteCarloMove` actually do. If they are sufficiently simple, there are techniques for removing this branching. – matb Nov 03 '15 at 12:57
  • 1
    Note: the function calling overhead is larger than the failing branch prediction. – joop Nov 03 '15 at 12:59
  • @joop OK, but these functions may be inlined and just assigning constants to a global variable. – matb Nov 03 '15 at 13:02
  • 1
    @matb: or a conditional jsr is used. There are too many variables here. The question is far too broad. – too honest for this site Nov 03 '15 at 13:10
  • @matb Thanks for the comments. I added some assumptions that could reasonably be made about the functions. – Marco Tompitak Nov 03 '15 at 13:36
  • @joop That's already an interesting fact. Wouldn't that overhead be optimized out by a compiler, though? – Marco Tompitak Nov 03 '15 at 13:36
  • The cure for a poor branch prediction is a conditional move. That needs to start by generating a *bool* and an UpdateMonteCarloMove(bool) function. With unspecified magic in that function that can take advantage of conditional moves. The conditional operator is likely to generate it. Pretty low odds that such a rather heavy-handed micro-optimization produces a measurably better outcome, you only save a handful of nanoseconds at best and that easily drowns in the time needed to execute the update function. – Hans Passant Nov 03 '15 at 13:43
  • That single branch seems to be very insignificant compared to the amount of processing in the functions. If, and only if, you have found a performance problem, you should go look at the largest performance culprits, not at insignificant micro-optimizations. For example, replacing float numbers with integers would probably give a significant performance boost. – Lundin Nov 03 '15 at 13:44
  • Also: the for functions inside the loop will contain conditions too. At least the move generator. Maybe the accept/deny choice is not even a real choice; both legs update the current state(s), so maybe they could be combined. – joop Nov 03 '15 at 13:46
  • Thanks for all the comments. They go a long way towards answering the question. If anyone wants to write them up in answer-form I'll happily accept it. – Marco Tompitak Nov 04 '15 at 16:18

0 Answers0