3

I was trying to sort an array of objects containing around 100 large entities (having almost 30 keys ) on the basis of values of key which were deeply nested within an object and for that I was using lodash's orderBy method:

let name = (user) => user.accountDetails.name.toLowerCase();
let dob = (user) => user.personalProfile.dob;    
orderBy(cloneDeep(data), [name, dob], [sortOrder1, sortOrder2])

*considering sortOrder to be either desc or asec

But the time taken by the sort process is pretty large. What kind of faster approach we can use to sort the array of objects with the keys buried deep within the object?

Example data (consider 50 entries like these having 40 keys at least)

{
    "list": "bugs42",
    "start-date": "2015-08-27",
    "accountDetails": {
        "name": "diamond",
        "text": "8 months",
        "milliseconds": 19936427304
    }
    "personalProfile": {
        "name": "stark",
        "dob": "2003-03-12T09:26:39.980Z",
    }
},
{
    
    "list": "bugs50",
    "start-date": "2015-08-27",
    "accountDetails": {
        "name": "ruby",
        "text": "8 months",
        "milliseconds": 19936427305
    }
    "personalProfile": {
        "name": "warmachine",
        "dob": "2007-03-31T09:26:39.980Z",
    }
}
Ashish Bairwa
  • 757
  • 1
  • 6
  • 19
  • Does this answer your question? [How to sort a JavaScript array of objects by nested object property?](https://stackoverflow.com/questions/5073799/how-to-sort-a-javascript-array-of-objects-by-nested-object-property) – Heretic Monkey Apr 27 '21 at 17:10
  • 1
    You don't need to convert the string to a Date object. The string will sort naturally in order as it is in ISO 8601 format. – Heretic Monkey Apr 27 '21 at 17:12
  • No, as the .sort() internally uses a hybrid sort technique and since the objects are very large we should rather move with an approach which proceeds with minimum number of swaps and yes we can remove explicit date conversion @HereticMonkey – Ashish Bairwa Apr 27 '21 at 17:14

1 Answers1

3

1. Using JavaScript's built-in sort() functions

We can use JavaScript's built-in array sort() method which will sort this all quite quickly and nicely. It's important here to run the sort() method on a copy of the array and not the array itself if you would like the original array to stay the same. We can do this in a couple of very simple ways:

  • array.slice.sort(…)
  • [...array].sort(…)

In my example below, I opted to use spread syntax, the latter option:

const data = [{
  list: "bugs42",
  startdate: "2015-08-27",
  accountDetails: { name: "diamond", text: "8 months", milliseconds: 19936427304 },
  personalProfile: { name: "stark", dob: "2003-03-12T09:26:39.980Z" }
}, {
  list: "bugs50",
  startdate: "2015-08-27",
  accountDetails: { name: "ruby", text: "8 months", milliseconds: 19936427305 },
  personalProfile: { name: "warmachine", dob: "2007-03-31T09:26:39.980Z" }
}];

const sortByDobAsc = data => [...data].sort((a,b) => new Date(a.personalProfile.dob) - new Date(b.personalProfile.dob));

const sortByDobDes = data => [...data].sort((a,b) => new Date(b.personalProfile.dob) - new Date(a.personalProfile.dob));

console.log(sortByDobAsc(data), sortByDobDes(data));

For more I information on JavaScript's built-in sort() method, check out the MDN docs here: Array.prototype.sort()

2. Using 3rd-party sort functions

This article by Hariyanto Lim explores alternative sorting methods, and it appears there are several reputable, custom sorting algorithms you can choose from and even build upon.

The fastest from his comparisons appears to be QuickInsertionSort in Chrome and Safari, and either of the other quickSort functions in Firefox where QuickInsertionSort strangely becomes just as slow and slower in some cases than the native JS sort() method.

Here is the source code for all three alternative functions explored:

1. QuickInsertionSort()

function QuickInsertionSort(arr) {
  'use strict';

  if(!arr || 1 > arr.length) {
    return null;
  }

  var startIndex = 0, endIndex = arr.length - 1;

  // use 'stack' data structure to eliminate recursive call
  // DON'T use Array.push() and Array.pop() because slow !!!
  // so use manual indexing
  var stackLength = 0; 
  
  // use 2 arrays instead of 1 array to fasten (reduce calculation of '+= 2' and '-= 2')
  var startIndexes = [];
  var endIndexes = [];

  // variables for partitioning
  var partitionIndex, pivot, left, right, _swap_temp;

  // variables for insertion sort
  var i, j, key;

  do {
    // in my testing, I found 32 is very good choice for totally generated-random data,
    // more than 100 will cause slower speed overal.      
    if(32 >= endIndex - startIndex) {

      // even using insertionSort,
      // still need this because it still come here !!
      if(1 == endIndex - startIndex) {
        if(arr[startIndex] > arr[endIndex]) {
          _swap_temp = arr[startIndex];
          arr[startIndex] = arr[endIndex];
          arr[endIndex] = _swap_temp;
        }
      } else {
        /**************************************
        ****** start of insertion sort ********
        ***************************************/
        for(i = startIndex + 1; endIndex >= i; i++) {
          key = arr[i];
          
          // Move elements of arr[startIndex..i-1], that are 
          // greater than key, to one position ahead 
          // of their current position
          for (j = i - 1; j >= startIndex; j--) {
            if(arr[j] > key) {
              arr[j + 1] = arr[j];
              continue;
            }

            // use 'break' to avoid decreasing 'j' 
            break;
          }

          // swap
          arr[j + 1] = key;
        }
        /**************************************
        ****** end of insertion sort **********
        ***************************************/
      }

      // continue to process next data, is there any data inside stack ? 
      if(stackLength > 0) {
        // pop
        stackLength--; // reduce counter to get the last position from stack
        startIndex = startIndexes[stackLength];
        endIndex = endIndexes[stackLength];
      } else {
        // no data inside stack, so finish
        break;
      }
    } else {
      // squeeze every millisecond by put main logic here instead of separate function

      // in my testing using median_of_3 does not give better result for generated totally random data !!

      /*********************************************
      *********** start of partitioning ************
      ************* Tony Hoare *********************
      **********************************************/

      // minimize worst case scenario

      // === start of select pivot ============
      pivot = arr[startIndex];

      // try to find a different element value
      j = endIndex;
      while(pivot == arr[j] && j >= startIndex) {
        j--;
      }
      if(j > startIndex) {
        // check which element is lower? 
        // use the lower value as pivot   
        if(pivot > arr[j]) {
          pivot = arr[j];
        }
      }
      // === end of select pivot ============

      left = startIndex;
      right = endIndex;

      do {
        
        while(pivot > arr[left]) {
          left++;
        }

        while(arr[right] > pivot) {
          right--;
        }

        if(left >= right) {
          partitionIndex = right;
          break;
        }

        //swap(left, right);
        // because many swaps, so optimize to implement swap here !
        _swap_temp = arr[left];
        arr[left] = arr[right];
        arr[right] = _swap_temp;

        left++;
        right--;
      } while(true); // loop forever until break

      if(partitionIndex > startIndex) {
        // has lower partition, so process it

        if(endIndex > partitionIndex + 1) {
          // push 'right' side partition info into stack for later
          startIndexes[stackLength] = partitionIndex + 1;
          endIndexes[stackLength] = endIndex;
          stackLength++; // increase counter for NEXT slot
        }

        // prepare next loop
        // keep same value for startIndex but update endIndex
        endIndex = partitionIndex;

      } else if(endIndex > partitionIndex + 1) {
        // at this point, it means there is no 'lower' side partition but has 'higher' side partition

        // prepare next loop
        // keep same value for endIndex but update startIndex
        startIndex = partitionIndex + 1;
      }
      
      /*********************************************
      ****** end of Tony Hoare partitioning ********
      **********************************************/
    }
  } while(endIndex > startIndex);
}

2. quickSort_by_Tony_Hoare_non_recursive()

function quickSort_by_Tony_Hoare_non_recursive(arr) {
  'use strict';

  if(!arr || 1 > arr.length) {
    return null;
  }

  var arrLength = arr.length;

  var startIndex = 0, endIndex = arrLength - 1;

  // don't use Array.push() and Array.pop() because too slow
  // use 2 arrays instead of 1 to avoid unnecessary increasing and reducing stackLength
  var stackStartIndex = [], stackEndIndex = [];
  var stackLength = 0;

  var partitionIndex;

  var i, j, is_key;

  do {
    partitionIndex = partition_by_Tony_Hoare(arr, startIndex, endIndex);

    if(partitionIndex > startIndex) {
      // there is lower values to partition 

      // is there higher values?
      if(endIndex > partitionIndex + 1) { 
        // we don't do it now, push it into stack for later 
        stackStartIndex[stackLength] = partitionIndex + 1;
        stackEndIndex[stackLength] = endIndex;
        stackLength++; // increase counter for next slot
      }

      // set new parameter to partition lower values 
      endIndex = partitionIndex;
    } else if(endIndex > partitionIndex + 1) { 
      // there is no lower values, only higher value, this is worst case!
      // set new parameter for next partitioning
      startIndex = partitionIndex + 1;
    } else {
      // no valid partitioning index, so we get from stack (if any)
      if(stackLength > 0) {
        stackLength--;
        startIndex = stackStartIndex[stackLength];
        endIndex = stackEndIndex[stackLength];
      } else {
        break; // finished !
      }
    }
  } while(endIndex > startIndex);

  return arr;
}

3. quickSort_by_Nico_Lomuto()

function quickSort_by_Nico_Lomuto(arr, startIndex, endIndex) {
  // using Nico Lomuto partition scheme
  // simpler and easier to understand.    

  if(endIndex > startIndex) {

    var partitionIndex = partition_by_Nico_Lomuto(arr, startIndex, endIndex);

    // the item at partitionIndex will not be included in recursive sorting because 
    // arr[partitionIndex] >= [...lowers]
    // [...highers] >= arr[partitionIndex]

    // recursion to sort lower values
    quickSort_by_Nico_Lomuto(arr, startIndex, partitionIndex - 1);

    // recursion to sort higher values
    quickSort_by_Nico_Lomuto(arr, partitionIndex + 1, endIndex);
  }

  return arr;
}

function partition_by_Nico_Lomuto(arr, startIndex, endIndex) {
  // easier to implement and understand 

  //var pivot = arr[startIndex];

  // Lomuto partitioning has worst case if selected pivot value is LARGEST value in the range!
  // prevent worst case by carefully selecting pivot value!
  var pivot = selectPivot(arr, startIndex, endIndex, true); // true = MUST do swapping !
  
  var i = startIndex;

  // one time loop from bottom to the second from top, because pivot is the top position
  for(j = startIndex; endIndex > j; j++) {
    // is current element is smaller than or equal to pivot ?
    if(pivot >= arr[j]) {
      // swap 
      swap(arr, i, j);

      i++;
    }
  }

  // swap
  swap(arr, i, endIndex);

  return i;
}

function selectPivot(arr, startIndex, endIndex, doSwap) {
  // find a pivot value which not the lowest value within the range 
  
  // Get 2 UNIQUE elements, if failed then it means all elements are same value.

  var pivot = arr[startIndex]; // get first element from the first position

  // try to find a different element value
  var j = endIndex;
  while(pivot == arr[j] && j >= startIndex) {
    j--;
  }
  if(startIndex > j) {
    //console.log('selectPivot(arr, ' + startIndex + ',' + endIndex + '), all elements are equal, nothing to sort');
    return pivot;
  }

  // check which element is lower? 
  // use the lower value as pivot and swap the position with the last position (endIndex)   
  if(pivot > arr[j]) {
    pivot = arr[j];
    if(doSwap) {
      swap(arr, j, endIndex);
    }
  } else {
    if(doSwap) {
      swap(arr, startIndex, endIndex);
    }
  }

  return pivot;
}

function swap(arr, a, b) {
  // replace more than 1 element value in array using 1 line
  // this ability is 'ES6 destructuring swap',
  // only specific for Javascript language
  // but VERY VERY SLOW, almost 3 times slower !
  //[arr[a], arr[b]] = [arr[b], arr[a]];

  // normal way for many programming language
  var _swap_temp = arr[a];
  arr[a] = arr[b];
  arr[b] = _swap_temp;
}
Brandon McConnell
  • 5,776
  • 1
  • 20
  • 36
  • Lodash is also using javascript's builtin .sort method, but i'm looking for something much faster than that for this specific case. I was rather thinking to flatten the objects and then perform a parent level sort. But flatten is also costlier and would lead to much greater time-complexity – Ashish Bairwa Apr 27 '21 at 16:04
  • @Stark I see. Reading through [this article](https://quick.work/?page=view-blog&id=24), it looks like there are several custom sorting algorithms you can choose from and even build upon. The fastest from their comparisons appears to be `QuickInsertionSort` in Chrome and Safari, and either of the other `quickSort` functions in Firefox where `QuickInsertionSort` strangely becomes just as slow and slower in some cases than the native JS `sort()` method. I'll update my answer to show this as well. – Brandon McConnell Apr 27 '21 at 16:50
  • Hmm, I have to collect some statistics regarding the performance of this hybrid sort technique for large objects. Thanks @Brandon McConnell – Ashish Bairwa Apr 27 '21 at 17:21
  • Take a look at the article if you have a chance. He shows specific statistics for each against each other and each shows how he benchmarked each against each other. – Brandon McConnell Apr 27 '21 at 17:32