60

By switching a javascript sort function from

myArray.sort(function (a, b) {
  return a.name.localeCompare(b.name);
});

to

myArray.sort(function (a, b) {
  return (a.name < b.name ? -1 : (a.name > b.name ? 1 : 0));
});

I was able to cut the time to sort a ~1700 element array in Chrome from 1993 milliseconds to 5 milliseconds. Almost a 400x speedup. Unfortunately this is at the expense of correctly sorting non-english strings.

Obviously I can't have my UI blocking for 2 seconds when I try to do a sort. Is there anything I can do to avoid the horribly slow localeCompare but still maintain support for localized strings?

Preview
  • 35,317
  • 10
  • 92
  • 112
Brad Dwyer
  • 6,305
  • 8
  • 48
  • 68
  • 2
    Consider spinning off a web worker to do the `localeCompare` based sort asynchronously. You might find that the time spent serializing and deserializing that amount of data outweighs the benefits of doing it async, but it's worth a shot. – Matt Ball Feb 03 '13 at 20:45
  • That'd probably work but 2 seconds is still really slow to show results. – Brad Dwyer Feb 04 '13 at 22:22
  • You could consider a different approach - like keeping the list sorted from the start, so you never need to explicitly sort it. Where does the data come from? There are some self-sorting data structures for JavaScript already implemented: http://stackoverflow.com/a/5309821/139010 or http://stackoverflow.com/a/3809836/139010 – Matt Ball Feb 04 '13 at 22:44
  • It comes from Facebook. We ended up preloading it and sorting it before they needed to access it. – Brad Dwyer Feb 05 '13 at 23:49
  • 1
    @MattBall, web workers don't seem to handle localeCompare the same way as the rest of the code. See [this question](http://stackoverflow.com/questions/16550774/locale-string-comparison-does-not-work-properly-in-firefox-extension-web-worker) – redbmk Jun 21 '13 at 22:48
  • 2
    note that localeCompare is not case-sensitive (or maybe it depends on users locale? on my pc set to en_US it is not case sensitive). Your replacement code is case sensitive, so "Foo" comes before "bar" – Kip Sep 13 '18 at 14:05

5 Answers5

56

A great performance improvement can be obtained by declaring the collator object beforehand and using it's compare method. EG:

const collator = new Intl.Collator('en', { numeric: true, sensitivity: 'base' });
arrayOfObjects.sort((a, b) => {
  return collator.compare(a.name, b.name);
});

NOTE: This doesn't work ok if the elements are floats. See explanation here: Intl.Collator and natural sort with numeric option sorts incorrectly with decimal numbers

Here's a benchmark script comparing the 3 methods:

const arr = [];
for (let i = 0; i < 2000; i++) {
  arr.push(`test-${Math.random()}`);
}

const arr1 = arr.slice();
const arr2 = arr.slice();
const arr3 = arr.slice();

console.time('#1 - localeCompare');
arr1.sort((a, b) => a.localeCompare(
  b,
  undefined, {
    numeric: true,
    sensitivity: 'base'
  }
));
console.timeEnd('#1 - localeCompare');

console.time('#2 - collator');
const collator = new Intl.Collator('en', {
  numeric: true,
  sensitivity: 'base'
});
arr2.sort((a, b) => collator.compare(a, b));
console.timeEnd('#2 - collator');

console.time('#3 - non-locale');
arr3.sort((a, b) => (a < b ? -1 : (a > b ? 1 : 0)));
console.timeEnd('#3 - non-locale');
Andy
  • 6,869
  • 2
  • 31
  • 24
  • 4
    @BradDwyer, I edited the answer to include the benchmark script. – Andy Sep 19 '18 at 08:28
  • Nice! Only 15x slower for me on Chrome 69 compared with 800x slower for the localeCompare version. (1.70ms non-locale, 25.96ms collator, 1380.65ms localeCompare) – Brad Dwyer Sep 19 '18 at 15:19
  • @Andy, there's a bug in that code. Test #3 reuses `arr2`, which should already be sorted by the previous test, so the third test is artificially faster. Better to call `arr.slice().sort(...)` in each test. But more importantly, #3 should call `toLocaleLowercase()` on the params to give a fair comparison. Otherwise, it produces a different sort order than the first two tests. – jdunning Oct 28 '18 at 01:27
  • @junning I fixed the 3rd array typo. It's still as fast. The point of the 3rd test case was to outline the non-locale compare. It certainly won't return the same results, but it might be sufficient for some use case that doesn't require locale (eg: sorting a list of English names) – Andy Oct 29 '18 at 14:53
  • Sorting a particular 500 item array went from 40 seconds to <1 second in IE 11 – Karlth Oct 09 '19 at 11:17
14

An effective Approach that I found when dealing with /mostly/ latin characters is to use the operator whenever both strings match a specific regex. EG: /^[\w-.\s,]*$/

It's much faster if both strings match the expression, and at worst it seems to be slightly slower than blindly calling localeCompare.

Example here: http://jsperf.com/operator-vs-localecompage/11

Update: it seems like Intl.Collator is currently the best option for performance across the board: https://jsperf.com/operator-vs-localecompage/22

Jamie Pate
  • 1,783
  • 20
  • 18
  • Absolutely perfect for me, and deserves more upvotes! My dataset is 99% unaccented, so your no_locale regex makes a huge difference. – Codemonkey Jan 12 '16 at 09:44
  • Can you explain what the regex does? – Stijn de Witt Jan 16 '17 at 11:26
  • The regex detects whether the string contains only alphanumeric characters. \w Matches any alphanumeric character including the underscore. Equivalent to [A-Za-z0-9_]. LocaleCompare is irrelevant for these characters (in most cases?) – Jamie Pate Jan 26 '17 at 18:28
  • `LocaleCompare` is not irrelevant for alphanumeric characters, because regular comparison sorts all uppercase characters ahead of lowercase characters. Your jsperf tests call `toLowerCase()` before calling `localeCompare`. That's an invalid perf test. When using `localeCompare` you should not use `toLowerCase()`. – gilly3 Sep 23 '19 at 20:15
  • 1
    Localecompare is/was so many orders of magnitude slower that the toLowerCase is largely irrelevant. I recently redid the benchmarks and Intl.Collator beats the faster regex shortcut version these days anyways. https://jsperf.com/operator-vs-localecompage/22 – Jamie Pate Sep 24 '19 at 02:43
6

It's difficult to know the fastest sort without seeing the data you are sorting. But jsperf has lots of good tests showing the performance differences between types of sorting: http://jsperf.com/javascript-sort/45 http://jsperf.com/sort-algorithms/31

However none of these account for localised strings, and i'd imagine there is no easy way to sort localised strings and localeCompare is probably the best solution for this.

Looking at mozilla reference is says: "When comparing large numbers of strings, such as in sorting large arrays, it is better to create an Intl.Collator object and use the function provided by its compare property." https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/localeCompare

But going to the Intl.Collator reference it shows that is not support for firefox/safari https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Collator

you could try using some of the options on localCompare to speed up the performance. But I've just done a quick test changing the sensitivity level and it seems like it won't improve the performance:

list.sort(function(a, b) {
  return a.localeCompare(b, {sensitivity:'base'});
});

http://jsperf.com/sort-locale-strings

Preview
  • 35,317
  • 10
  • 92
  • 112
Kim T
  • 5,770
  • 1
  • 52
  • 79
  • 1
    >> it is better to create an Intl.Collator object and use the function provided by its compare property - absolutely agree. I've made some measurements and yes, the compare speed is much more higher 16ms vs 25sec with localCompare on 1000 rows – Serge Feb 23 '15 at 13:19
2

Try sorting it in 2 steps:

  1. With the operator: as you said, it will be 400 times faster
  2. Then with localCompare(): this has now less comparisons to do because the array is mostly sorted.

Note: I think that localCompare() will mostly be called with at least 1 string that is not english. So the number of calls to localCompare() with 2 english strings should be greatly reduced.

Here is the code:

myArray.sort(function(a, b) {
  return (a.name < b.name ? -1 : (a.name > b.name ? 1 : 0));
});

myArray.sort(function(a, b) {
  return a.name.localeCompare(b.name);
});

This solution has the advantage of being short and easy to use. It will be efficient if the array contains mostly english strings. The more non-english strings you have, the less useful the first sort will be. But as it is easy to add in your scripts, it is also easy to see if this approach is worthwile.

Now if I were you, I would also use an Intl.Collator, as it is said to be much faster than localCompare() when you have many comparisons to do.

Preview
  • 35,317
  • 10
  • 92
  • 112
jlgrall
  • 1,652
  • 1
  • 13
  • 13
  • 2
    Not every sorting algorithm can take advantage of an already mostly sorted array (funnily enough, for a very naive quicksort it's a disaster). No idea if those used in Javascript can. – maaartinus Mar 28 '16 at 00:36
-3

I don't know you still looking for solution to this problem

// Defaulted to ascending
// 1 asc | -1 desc
var direction = 1; 
myArray.sort(function (a, b) {
  return a.name.localeCompare(b.name) === 1 ? direction : -1 * direction;
});

i added a === 1 check to your code and this improved perf 400x that means both have comparable perf numbers.

Perf numbers with localeCompare arr size: 3200 Avg time took in 10 repetitions : 60 ms

Perf numbers with > approach. avg time took 55 ms

Preview
  • 35,317
  • 10
  • 92
  • 112
Joseph
  • 11
  • 4
  • I'm not sure how this solves the problem. Can you do a jsperf with your findings? how does ===1 improve perf 400x. – Jamie Pate Sep 10 '14 at 21:35
  • 5
    Sry, but your solution is **WRONG**: `localeCompare()` may return values different than -1, 0 or 1. Look at the [doc](http://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/localeCompare). Also, I highly doubt that adding a multiplication is faster than not having one. You should make 2 comparators: one for ascending, one for descending order. The JIT will be able to inline them much better. – jlgrall May 24 '15 at 12:06