0

I have a java code that calculate the running time for multiple sorting algorithms such as "Merge sort , Bubble sort and so on ..".

The running time after the first algorithm is not calculated correctly due to branch prediction. So is there anyway to avoid branch prediction to get the correct running time.

Example:Running time for revers sorted array with length 200000 index is as below:
Average runtime for Merge Sort in seconds after 10 iteration is : 0.0204354182
Average runtime for Bubble Sort in seconds after 10 iteration is : 1.0596160000000001E-4

As you see the bubble sort running time is not correct, it should be more than the running time for merge sort for such array.

Appreciate your help.

Sholi
  • 11
  • 5

1 Answers1

0

Probably your later iterations are sorting an already-sorted array the first iteration produced. BubbleSort is fast in that case if you use the early-out on no swaps version. MergeSort is constant time and always does the same amount of work, even for sorted input.

Sort a copy of your reversed array.

Copying around the input several times sounds about right for a factor of 20 perf difference vs. scanning through it once. (MergeSort on sorted input probably degenerates to copying all of one half, then copying all of the other half. In smaller and smaller chunks, so at some point they start fitting in L2 and then L1d cache, if we're talking about ints not strings.)

The running time after the first algorithm is not calculated correctly due to branch prediction.

That sounds unlikely. It can be different from the steady-state case, but the amount of pattern that branch prediction can "learn" and "remember" should be a small part of a 200000 sort.

More likely the first iteration is slow because of other warm-up effects like JIT compiling, and CPU frequency not yet ramping up from idle to max.

See Idiomatic way of performance evaluation?. If you're discarding the sorted copy every iteration, make sure times are still reasonable; if the optimizer is too smart, it could defeat your benchmark by simply not doing work to produce a result array that nothing ever uses.

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847