Why does my Bubble Sort sort reversed data far faster than random data or data with multiple identical values?

Question

So from what I understand Bubble sort should be slower at sorting reversed data than any other type. I am currently going through sorting algorithms and have implemented my own Bubble algorithm and it is about half as slow again at sorting random data and data with multiple values which are identical than reversed data (Timed with java's System.nanoTime()). I'm intrigued by this result and I can't account for it.

Here is the code for my algorithm:

public static void bubbleSort(int[] arr)
{
    //Will only check elements which haven't been sorted yet. checks and newChecks will handle this.
     int checks = arr.length, newChecks;

     //Continues comparing elements until a swap does not occur, indicating a sorted list.
     do
     {
        newChecks = 0;

        for (int i = 1; i < checks; i++)
        {
            if (arr[i - 1] > arr[i])
             {
                swapElements(arr, i - 1, i);

                //Gives newChecks the position of the element just switched
            newChecks = i;
            }
        }

        //The last switched element indicates after which point the array is sorted. Thus the assigning of newChecks to checks.
        checks = newChecks;
    } while (checks > 0);
}

Accumulate the number of loop iterations and number of swaps that the code makes, then run the algorithm against random data and reverse-sorted data, and print the results. — jarmod, Nov 25 '13 at 02:09
I'm also confused with this one. I thought I write something wrong. This question really helps me. — Elec, Jun 30 '22 at 16:05

score 3 · Accepted Answer · edited May 23 '17 at 11:49

I suspect this is due to branch prediction.

When your data is sorted in reverse order, the result of the comparison if (arr[i - 1] > arr[i]) will always be true initially, but as the sorting gets under way, there will be a growing number of properly sorted items at the end of the data list. In the sorted part of the data, this comparison will always return false.

As a result, a simple branch prediction heuristic (i.e., the assumption that the result of each comparison will be the same as the result of the previous comparison) will produce correct results every time except twice per loop — once at the very beginning of the data and once at the point where the unsorted data meets the sorted data.

By predicting the outcome of this comparison in advance, your processor can use its pipeline to process more iterations of the loop simultaneously, and finishes sorting earlier than it would have done with randomly ordered data.

(You should still avoid using bubble sort, however.)

Excellent, that makes perfect sense. I just did it as part of a personal project. Not actually going to be properly sorting anything with it. — Zach Beavon-Collin, Nov 25 '13 at 04:45

Why does my Bubble Sort sort reversed data far faster than random data or data with multiple identical values?

1 Answers1