3

I am comparing the running time of Counting sort with Java native Arrays.sort. Form what I've read the Counting sort offers a best, average and worst case n+k running time. Javas Arrays sort of primitives, using a dual - pivot Quicksort, is a comparison based algorithm thus must offer a O(n log n) in the average case, and On2 worst case.

When comparing the two by measuring time (nanoseconds) taken to sort a series of arrays ranging in size 500 to 100k, I noted a sharp increase in running time for the Counting sort when the size reached ~70k.

My understanding is the Counting sort is efficient as long as the range of input data is not significantly greater then the number of elements to be sorted The arrays are built from random numbers between 0 and 99, so k will always be much smaller than n.

Would there be any particular reason the Counting sort would degenerate so abruptly as n increases ?

Running time in nanoseconds (y) vs Array size (x)

My counting sort implementation:

public static int[] countSort(int[] arr, int k) {
        /*
         * This will only work on positive integers 0 to K.
         * For negative  worst case testing we use the negative count sort below.
         * 
         * Step 1: Use an array to store the frequency of each element. For array
         * elements 1 to K initialize an array with size K. Step 2: Add elements of
         * count array so each element stores summation of its previous elements. Step
         * 3: The modified count array stores the position of elements in actual sorted
         * array. Step 5: Iterate over array and position elements in correct position
         * based on modified count array and reduce count by 1.
         */

        int[] result = new int[arr.length];
        int[] count = new int[k + 1];
        for (int x = 0; x < arr.length; x++) {
            count[arr[x]]++;
        }

        /*
         * Store count of each element in the count array Count[y] holds the number of
         * values of y in the array 'arr'
         */

        for (int y = 1; y <= k; y++) {
            count[y] += count[y - 1];
        }

        /*
         * Change count[i] so that count[i] now contains actual Position of this element
         * in result array
         */

        for (int i = arr.length - 1; i >= 0; i--) {
            result[count[arr[i]] - 1] = arr[i];
            count[arr[i]]--;
        }

        System.out.println("COUNTSORT SORTED ARRAY = " + Arrays.toString(result));
        return result;

    }

Resolution: Running the Counting sort in place as per @Alex's suggestion resulted in a far more superior run time.

Run time of modified in-place count sort vs Java primitive sort

dancingbush
  • 2,131
  • 5
  • 30
  • 66
  • 1
    Hey! Is the question why the spike occurs or is the question why Javas sort outperforms the custom one? *FYI:* Note that Javas sort is highly optimized and probably uses native system methods. You can't beat that with "normal" code. I've tried. – akuzminykh Apr 18 '20 at 18:14
  • Java doesn’t use quick sort anymore. It uses Tim sort. Or is that a double pivot quick sort? – Axel Apr 18 '20 at 18:16
  • 1
    What are the specs of the hardware you are using for the test? – Axel Apr 18 '20 at 18:23
  • Hi, its a 2.9Ghz i7 8 GB 1600 MHz DDR3 MacBook pro – dancingbush Apr 18 '20 at 18:51
  • Possibly related: [How do I write a correct micro benchmark in java](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) – meriton Apr 18 '20 at 19:22
  • Very useful, thanks – dancingbush Apr 18 '20 at 19:30
  • We cannot explain (or even reliably reproduce) your results without seeing the complete source code of your benchmarks. But >>my<< guess is that the explanation for the huge jump in the graph is that there is a flaw in your benchmarking technique. Various JVM warmup effects could cause that. – Stephen C Apr 19 '20 at 07:16

1 Answers1

2

Just a guess, but your sorting algorithm uses much more memory than Java’s. 70k of ints are 280KB. You need double the space, more than 512KB. Depending on the processor used, that could make the difference between running the sort in (L1?) cache and having lots of cache misses. Since you don’t really need the copy, do the sort in place. If you now hit the wall later, you have the answer.

Edit: it’s 280KB.

Edit2: It was late yesterday, so here comes the in-place version. Note that it modifies the input array.

public static int[] countSortRefactored(int[] arr, int k) {
    int[] count = new int[k + 1];
    for (int x = 0; x < arr.length; x++) {
        count[arr[x]]++;
    }

    int idx=0;
    for (int x=0; x<=k; x++) {
        Arrays.fill(arr, idx, idx+=count[x], x);
    }

    System.out.println("COUNTSORT SORTED ARRAY = " + Arrays.toString(arr));
    return arr;
}
Axel
  • 13,939
  • 5
  • 50
  • 79
  • Hi Axel, its an intel i7 dual core with L2 256kb per core and L3 4MB. Could the performance degradation be down to a switch from L2 to L3? – dancingbush Apr 18 '20 at 19:02
  • Yes, that’s what I think. – Axel Apr 18 '20 at 19:04
  • You mentioned performing the sort in place, how is this achieved? – dancingbush Apr 18 '20 at 19:23
  • 2
    Good point about cache misses; since quicksort has a near sequential access pattern, it should experience far fewer cache misses than countingsort which does random access all over the place. That likely explains why quicksort is faster even though it executes significantly more instructions. – meriton Apr 18 '20 at 19:37
  • Please try the example code I just provided. I'm really interested what difference this makes and how it compares to Java's Timsort in your case. – Axel Apr 19 '20 at 07:08
  • @Axel if i run your code with test array { 26, 14, 15, 16, 3, 13 } and call it with arguments count(array, 26) the output is missing the largest integer: [3, 13, 14, 15, 16, 13] – dancingbush Apr 19 '20 at 12:23
  • I tested with 70k random ints and both implementations had identical results. I’m on the road now. Will check when I come home. – Axel Apr 19 '20 at 12:26
  • Ah, just saw it. it should be x<= k in the loop condition. Sorry. – Axel Apr 19 '20 at 12:29
  • @Axel see the edited answer the run time shows marked improvement, thanks again – dancingbush Apr 19 '20 at 16:03