59

I have an array like;

["IL0 Foo", "PI0 Bar", "IL10 Baz", "IL3 Bob says hello"]

And need to sort it so it appears like;

["IL0 Foo", "IL3 Bob says hello", "IL10 Baz", "PI0 Bar"]

I have tried a sort function;

function compare(a,b) {
  if (a < b)
     return -1;
  if (a > b)
    return 1;
  return 0;
}

but this gives the order

["IL0 Foo", "IL10 Baz", "IL3 Bob says hello", "PI0 Bar"]

I have tried to think of a regex that will work but can't get my head around it.
If it helps the format will always be 2 letters, x amount of numbers, then any number of characters.

georg
  • 211,518
  • 52
  • 313
  • 390
Rooneyl
  • 7,802
  • 6
  • 52
  • 81
  • Letter first, then number? – Brad Christie Mar 18 '13 at 14:16
  • @BradChristie, yes sort by first two letters then the numbers, the others characters not required (but would be nice) – Rooneyl Mar 18 '13 at 14:17
  • So if you had `IL10 Hello` & `IL10 Bob`, which comes first? – Brad Christie Mar 18 '13 at 14:18
  • in an ideal world IL10 Bob – Rooneyl Mar 18 '13 at 14:19
  • Possible Duplicate: [How may I sort a list alphabetically using jQuery?](http://stackoverflow.com/q/1134976/299327) – Ryan Gates Mar 18 '13 at 14:22
  • What exactly does "IL10" represent? could that be passed as an array of two vales or an object rather than a concatenated string? It appears as though you want to sort alphabetically first, then numerically. To do that you'll have to split the string into parts and compare that way. In an alphabetic sort(the way you are doing it) `IL10` will always come before `IL3` – Kevin B Mar 18 '13 at 14:23
  • 1
    @RyanGates Except this isn't an alphabetical ordering and has nothing to do with elements in the DOM. – Anthony Grist Mar 18 '13 at 14:23
  • can there be a second number or is there always just one? example: "IL3 Bob says 55" – Jan Hommes Mar 18 '13 at 14:31
  • 1
    Does this answer your question? [Javascript : natural sort of alphanumerical strings](https://stackoverflow.com/questions/2802341/javascript-natural-sort-of-alphanumerical-strings) – Ivar Sep 05 '21 at 10:21

8 Answers8

115

This is called "natural sort" and can be implemented in JS like this:

function naturalCompare(a, b) {
    var ax = [], bx = [];

    a.replace(/(\d+)|(\D+)/g, function(_, $1, $2) { ax.push([$1 || Infinity, $2 || ""]) });
    b.replace(/(\d+)|(\D+)/g, function(_, $1, $2) { bx.push([$1 || Infinity, $2 || ""]) });
    
    while(ax.length && bx.length) {
        var an = ax.shift();
        var bn = bx.shift();
        var nn = (an[0] - bn[0]) || an[1].localeCompare(bn[1]);
        if(nn) return nn;
    }

    return ax.length - bx.length;
}

/////////////////////////

test = [
    "img12.png",
    "img10.png",
    "img2.png",
    "img1.png",
    "img101.png",
    "img101a.png",
    "abc10.jpg",
    "abc10",
    "abc2.jpg",
    "20.jpg",
    "20",
    "abc",
    "abc2",
    ""
];

test.sort(naturalCompare)
document.write("<pre>" + JSON.stringify(test,0,3));

To sort in reverse order, just swap the arguments:

test.sort(function(a, b) { return naturalCompare(b, a) })

or simply

test = test.sort(naturalCompare).reverse();
georg
  • 211,518
  • 52
  • 313
  • 390
  • 1
    Seems to favor strings that start with alphabetical characters over numeric characters. Fixed by replacing `$1 || 0` with `typeof($1) == 'undefined' ? Infinity : $1` – colllin May 04 '14 at 07:30
  • 2
    Could anyone help how this can be implemented for both ascending and descending sorting of array ? – anusreemn Mar 18 '15 at 11:51
  • Please note the `.reverse ` will iterate the array again. – Rafael Herscovici Feb 23 '17 at 02:34
  • Can someone explain with some detial exactly what is happening with the regex replaces / Infinity? Then the while loop with shift mechanisms? I don't fully understand what is happening there – DRobertE Mar 27 '18 at 14:06
  • I just wanted to ask, for time-space complexity/Big O, this would be O(n) right? – Malcolm Salvador Oct 17 '21 at 19:46
  • If you *also* want the sort to be case-insensitive, the line that assigns `nn` can be replaced with `var nn = (an[0] - bn[0]) || an[1].localeCompare(bn[1], undefined, {sensitivity: 'base'});` – Tim Angus Mar 07 '22 at 17:27
22

You could use String#localeCompare with options

sensitivity

Which differences in the strings should lead to non-zero result values. Possible values are:

  • "base": Only strings that differ in base letters compare as unequal. Examples: a ≠ b, a = á, a = A.
  • "accent": Only strings that differ in base letters or accents and other diacritic marks compare as unequal. Examples: a ≠ b, a ≠ á, a = A.
  • "case": Only strings that differ in base letters or case compare as unequal. Examples: a ≠ b, a = á, a ≠ A.
  • "variant": Strings that differ in base letters, accents and other diacritic marks, or case compare as unequal. Other differences may also be taken into consideration. Examples: a ≠ b, a ≠ á, a ≠ A.

The default is "variant" for usage "sort"; it's locale dependent for usage "search".

numeric

Whether numeric collation should be used, such that "1" < "2" < "10". Possible values are true and false; the default is false. This option can be set through an options property or through a Unicode extension key; if both are provided, the options property takes precedence. Implementations are not required to support this property.

var array = ["IL0 Foo", "PI0 Bar", "IL10 Baz", "IL3 Bob says hello"];

array.sort(function (a,b) {
    return a.localeCompare(b, undefined, { numeric: true, sensitivity: 'base' });
});

console.log(array);
Community
  • 1
  • 1
Nina Scholz
  • 376,160
  • 25
  • 347
  • 392
  • Will it work if array contains undefined or null values? I guess it will just ignore it instead of placing them at the front or end depending on type of sort – Umang Apr 26 '22 at 03:35
  • 1
    @Umang, unfortunately `localeCompare` works only for strings. the parameter is converted to string. – Nina Scholz Apr 26 '22 at 07:36
5
var re = /([a-z]+)(\d+)(.+)/i;
var arr = ["IL0 Foo", "PI0 Bar", "IL10 Baz", "IL3 Bob says hello"];
var order = arr.sort( function(a,b){
    var ma = a.match(re),
        mb = b.match(re),
        a_str = ma[1],
        b_str = mb[1],
        a_num = parseInt(ma[2],10),
        b_num = parseInt(mb[2],10),
        a_rem = ma[3],
        b_rem = mb[3];
    return a_str > b_str ? 1 : a_str < b_str ? -1 : a_num > b_num ? 1 : a_num < b_num ? -1 : a_rem > b_rem;  
});
epascarello
  • 204,599
  • 20
  • 195
  • 236
3

I liked georg's solution a lot, but I needed underscores ("_") to sort before numbers. Here's how I modified his code:

var chunkRgx = /(_+)|([0-9]+)|([^0-9_]+)/g;
function naturalCompare(a, b) {
    var ax = [], bx = [];
    
    a.replace(chunkRgx, function(_, $1, $2, $3) {
        ax.push([$1 || "0", $2 || Infinity, $3 || ""])
    });
    b.replace(chunkRgx, function(_, $1, $2, $3) {
        bx.push([$1 || "0", $2 || Infinity, $3 || ""])
    });
    
    while(ax.length && bx.length) {
        var an = ax.shift();
        var bn = bx.shift();
        var nn = an[0].localeCompare(bn[0]) || 
                 (an[1] - bn[1]) || 
                 an[2].localeCompare(bn[2]);
        if(nn) return nn;
    }
    
    return ax.length - bx.length;
}

/////////////////////////

test = [
    "img12.png",
    "img10.png",
    "img2.png",
    "img1.png",
    "img101.png",
    "img101a.png",
    "abc10.jpg",
    "abc10",
    "abc2.jpg",
    "20.jpg",
    "20",
    "abc",
    "abc2",
    "_abc",
    "_ab_c",
    "_ab__c",
    "_abc_d",
    "ab_",
    "abc_",
    "_ab_cd",
    ""
];

test.sort(naturalCompare)
document.write("<pre>" + JSON.stringify(test,0,3));
Chaim Leib Halbert
  • 2,194
  • 20
  • 23
3

Pad numbers in string with leading zeros, then sort normally.

var naturalSort = function (a, b) {
    a = ('' + a).replace(/(\d+)/g, function (n) { return ('0000' + n).slice(-5) });
    b = ('' + b).replace(/(\d+)/g, function (n) { return ('0000' + n).slice(-5) });
    return a.localeCompare(b);
}

var naturalSortModern = function (a, b) {
    return ('' + a).localeCompare(('' + b), 'en', { numeric: true });
}

console.dir((["IL0 Foo", "PI0 Bar", "IL10 Baz", "IL3 Bob says hello"].sort(naturalSort)));

console.dir((["IL0 Foo", "PI0 Bar", "IL10 Baz", "IL3 Bob says hello"].sort(naturalSortModern)));
axfree
  • 31
  • 3
2

You could do a regex like this to get non-numeric and numeric parts of the string:

var s = "foo124bar23";
s.match(/[^\d]+|\d+/g)

returns: ["foo", "124" , "bar" , "23"]

Then in your compare function you can iterate through the parts of the two strings comparing them part-by-part. The first non-matching part determines the result of the overall comparison. For each part, check if the part starts with a digit and if so parse it as a number before doing the comparison.

Tim Goodman
  • 23,308
  • 7
  • 64
  • 83
1

Add one more alternative (why not):

var ary = ["IL0 Foo", "PI0 Bar", "IL10 Hello", "IL10 Baz", "IL3 Bob says hello"];

// break out the three components in to an array
// "IL10 Bar" => ['IL', 10, 'Bar']
function getParts(i){
    i = i || '';
    var parts = i.match(/^([a-z]+)([0-9]+)(\s.*)$/i);
    if (parts){
        return [
            parts[1],
            parseInt(parts[2], 10),
            parts[3]
        ];
    }
    return []; // erroneous
}
ary.sort(function(a,b){
    // grab the parts
    var _a = getParts(a),
        _b = getParts(b);

    // trouble parsing (both fail = no shift, otherwise
    // move the troubles element to end of the array)
    if(_a.length == 0 && _b.length == 0) return 0;
    if(_a.length == 0) return -1;
    if(_b.length == 0) return 1;

    // Compare letter portion
    if (_a[0] < _b[0]) return -1;
    if (_a[0] > _b[0]) return 1;
    // letters are equal, continue...

    // compare number portion
    if (_a[1] < _b[1]) return -1;
    if (_a[1] > _b[1]) return 1;
    // numbers are equal, continue...

    // compare remaining string
    if (_a[2] < _b[2]) return -1;
    if (_a[2] > _b[2]) return 1;
    // strings are equal, continue...

    // exact match
    return 0;
});

jsfiddle example

Brad Christie
  • 100,477
  • 16
  • 156
  • 200
0

Not pretty, but check the first two char codes. If all equal parse and compare the numbers:

var arr = ["IL0 Foo", "IL10 Baz", "IL3 Bob says hello", "PI0 Bar"];
arr.sort(function (a1, b1) {
    var a = parseInt(a1.match(/\d+/g)[0], 10),
        b = parseInt(b1.match(/\d+/g)[0], 10),
        letterA = a1.charCodeAt(0),
        letterB = b1.charCodeAt(0),
        letterA1 = a1.charCodeAt(1),
        letterB1 = b1.charCodeAt(1);
    if (letterA > letterB) {
        return 1;
    } else if (letterB > letterA) {
        return -1;
    } else {
        if (letterA1 > letterB1) {
            return 1;
        } else if (letterB1 > letterA1) {
            return -1;
        }
        if (a < b) return -1;
        if (a > b) return 1;
        return 0;
    }
});

Example

Joe
  • 80,724
  • 18
  • 127
  • 145
  • that gives me the order ["IL0 Foo", "IL10 Baz", "IL3 Bob says hello", "PI0 Bar"]. IL3 should be the second element – Rooneyl Mar 18 '13 at 14:31