2

I know how to sort arrays in javascript by using Array.sort(), but I don't know what is really happening at each iteration in the below code. In this code at certain points some duplicate elements begin to appear in the array. I've searched many websites for an explanation to this mystery, but I don't understand why this is happening. Run this code in chrome Browser. I guess its running on browser's native code. But I'd like to know the logic behind this.

I know the sorting is based on the returning of positive negative and zero values, but why do duplicate elements appear?

var arr = [4, 3, 2, 1];
arr.sort(function(a, b) {
  document.getElementById("output").innerHTML += ("[" + arr + "] " + 'a = ' + a + ' b = ' + b + "<br>");
  return a - b;
});
document.getElementById("output").innerHTML += "[" + arr + "]";
<div id="output"></div>
vijay
  • 926
  • 7
  • 15

2 Answers2

3

The short answer is this:

With any in place algorithm that operates on an array, there are no guarantees that intermediate values of the array are valid or complete. The way Chrome has chosen to implement their sort algorithm, there are intermediate steps where the array has values duplicated and is not a complete set of the original array, but this is perfectly acceptable because the sort algorithm is not finished yet.

TLDR version:

If you want to know the exact logic behind this, you can look at the source code for the V8 javascript engine that Chrome uses here https://github.com/v8/v8/blob/master/src/js/array.js

In particular for this case

For short (length <= 22) arrays, insertion sort is used for efficiency.

//This function was copied from the link above
function InsertionSort(a, from, to) {
  for (var i = from + 1; i < to; i++) {
    var element = a[i];
    for (var j = i - 1; j >= from; j--) {
      var tmp = a[j];
      var order = comparefn(tmp, element); //This is where your comparison function is called
      if (order > 0) {
        a[j + 1] = tmp;
      } else {
        break;
      }
    }
    a[j + 1] = element;
  }
};

Stepping through this to the first duplicate...

1st comparison: array before comparison [4,3,2,1]

i=1, j=0, tmp=a=4, element=b=3, order=1, arr[1]=tmp=a=4, arr[0]=element=b=3

2nd comparison: array before comparison [3,4,2,1]

i=2, j=1, tmp=a=4, element=b=2, order=2, arr[2]=tmp=a=4

3rd comparison: array before comparison [3,4,4,1]

i=2, j=0, tmp=a=3, element=b=2(this is still in memory from the outer for loop), order=1, arr[1]=tmp=a=3, arr[0]=element=b=2

4th comparison: array before comparison [2,3,4,1]...

From this you can see why values are briefly duplicated in the array because of how the for loops are set up in the insertion sort. It is set up to write to the array a minimal amount of times and search values are kept in memory, so intermediate values of the array do not need to be valid.

TJ Rockefeller
  • 3,178
  • 17
  • 43
2

Because default JavaScript sorting is not O(n).

It is most likely sorting with Heapsort or Mergesort (Firefox's SpiderMonkey engine) which both have a worst-case time complexity of O(n log n) Each browsers' engine uses its own variation of one of these sorting algorithms.

C++ integer sorting uses a Quicksort implementation. Which has a best-case of O(n log n) and a worst-case of O(n^2)

Anything with a time complexity above O(n) will start to show more comparisons than the total number of items.

Note: The default sorting algorithms have been reported to be unstable in the past and it is recommended that you may want to implement your own.

Edit

The values may appear duplicated in the array because you are printing the array DURING the sorting. If you notice the b value is the one missing, so it is being stored in a temporary swap variable while being compared to the other values.

Here is a step-wise debug of a Quicksort implementation. You can see what is going on during the critical steps.

String.repeat = function(str, n) {
  return Array.apply(null, Array(n)).map(String.prototype.valueOf, str).join('');
};
var Debugger = {};
Debugger.debugEnabled = true;
Debugger.indentChar = '&emsp;';
Debugger.println = function(label, indentLevel, rest) {
  if (Debugger.debugEnabled === true) {
    document.body.innerHTML += [
      String.repeat(Debugger.indentChar, indentLevel) +
      '<strong>',  label, ':</strong> ' +
      [].slice.call(arguments, 2).join(' ') +
      '<br />'
    ].join('');
  }
};
var QuickSort = {};
QuickSort.sort = function(items) {
  Debugger.println('Initial Array', 0, items); // Print initial array.
  return QuickSort._quickSort(items, 0, items.length - 1);
};
QuickSort._quickSort = function(items, left, right) {
  var index;
  Debugger.println('Before Sort', 1, items); // Print array before sort.
  if (items.length > 1) {
    index = QuickSort._partition(items, left, right);
    if (left < index - 1) {
      QuickSort._quickSort(items, left, index - 1);
    }
    if (index < right) {
      QuickSort._quickSort(items, index, right);
    }
    Debugger.println('After Sort', 1, items); // Print array after sort.
  }
  return items;
};
QuickSort._swap = function(items, firstIndex, secondIndex) {
  var temp = items[firstIndex];
  items[firstIndex] = items[secondIndex];
  Debugger.println('During Swap', 3, items); // Print array during the swap.
  items[secondIndex] = temp;
};
QuickSort._partition = function(items, left, right) {
  var pivot = items[Math.floor((right + left) / 2)], i = left, j = right;
  while (i <= j) {
    while (items[i] < pivot) {
      i++;
    }
    while (items[j] > pivot) {
      j--;
    }
    if (i <= j) {
      Debugger.println('Comparing', 2, items[i], ' (' + i + ') ', items[j], ' (' + j + ')');
      QuickSort._swap(items, i, j); // Print swapped array values.
      i++;
      j--;
    }
  }
  return i;
};

var array = [4, 3, 2, 1];
var result = QuickSort.sort(array);

Debugger.println('Result Array', 0, result);  // Print result array.
Mr. Polywhirl
  • 42,981
  • 12
  • 84
  • 132
  • 2
    Most standard implementations of sort use insertion sort once the array size is around less than 14 since most `O(n log n)` algorithms use recursive calls. Also, I think you missed the question's point. – Joshua Dawson Dec 28 '16 at 14:48
  • Not really... *"Anything with a time complexity above O(n) will start to show more comparisons than the total number of items."* — meaning that you will see multiple comparisons printing out. – Mr. Polywhirl Dec 28 '16 at 14:51
  • 1
    Yeah, definitely, but OP was wondering why numbers show up multiple times when you run the code snippet. See the third array [3,4,4,1] even though the original array was [4,3,2,1] (running chrome) – Joshua Dawson Dec 28 '16 at 14:55
  • @JoshDawson I added an example using Quicksort to show why the array may have duplicates. – Mr. Polywhirl Dec 28 '16 at 15:54