How do you sort an array of strings naturally in different programming languages? Post your implementation and what language it is in in the answer.
-
1Actually, the interesting part is the comparison function which could then be used in whatever sorting algorithm you fancy. – Svante Dec 07 '08 at 19:59
-
2Reading the comments to that blog entry, it appears that natural sorting is way underdefined. – David Thornley Oct 22 '10 at 21:37
15 Answers
Here's how you can get explorer-like behaviour in Python:
#!/usr/bin/env python
"""
>>> items = u'a1 a003 b2 a2 a10 1 10 20 2 c100'.split()
>>> items.sort(explorer_cmp)
>>> for s in items:
... print s,
1 2 10 20 a1 a2 a003 a10 b2 c100
>>> items.sort(key=natural_key, reverse=True)
>>> for s in items:
... print s,
c100 b2 a10 a003 a2 a1 20 10 2 1
"""
import re
def natural_key(astr):
"""See https://blog.codinghorror.com/sorting-for-humans-natural-sort-order/"""
return [int(s) if s.isdigit() else s for s in re.split(r'(\d+)', astr)]
def natural_cmp(a, b):
return cmp(natural_key(a), natural_key(b))
try: # use explorer's comparison function if available
import ctypes
explorer_cmp = ctypes.windll.shlwapi.StrCmpLogicalW
except (ImportError, AttributeError):
# not on Windows or old python version
explorer_cmp = natural_cmp
if __name__ == '__main__':
import doctest; doctest.testmod()
To support Unicode strings, .isdecimal()
should be used instead of .isdigit()
.
.isdigit()
may also fail (return value that is not accepted by int()
) for a bytestring on Python 2 in some locales e.g., '\xb2' ('²') in cp1252 locale on Windows.

- 5,540
- 4
- 39
- 40

- 399,953
- 195
- 994
- 1,670
-
In case people find this 2008 post rather than the 2 newer Python ones, I'd like to add a caveat: 'int' works for many useful cases (e.g. image3.jpg & image10.jpg) but fails for cases like ['elm1', 'Elm2'] and ['0.501', '0.55'] and [0.01, 0.1, 1] ... see http://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort/27430141#27430141 for lower() and my more general solution for Python natural sort order. – Scott Lawton Dec 11 '14 at 18:54
-
@ScottLawton: compare your solution with `StrCmpLogicalW` function. Read the blog post linked in the question. – jfs Dec 11 '14 at 19:46
Here's a cleanup of the code in the article the question linked to:
def sorted_nicely(strings):
"Sort strings the way humans are said to expect."
return sorted(strings, key=natural_sort_key)
def natural_sort_key(key):
import re
return [int(t) if t.isdigit() else t for t in re.split(r'(\d+)', key)]
But actually I haven't had occasion to sort anything this way.

- 37,241
- 25
- 195
- 267

- 14,921
- 5
- 53
- 53
-
Just to note that i used your code in another answer: http://stackoverflow.com/a/22685365/75554. the destination software will be up on github with attribution. get back to me if that is a problem – Nuno Furtado Apr 09 '14 at 12:25
-
No problemo. For your answer, note though that it's possible for some other process to create the file in between when you looked in the directory and when you try to create it. – Darius Bacon Apr 09 '14 at 22:56
-
you are right on that, but not sure how to solve that race condition. It's also not an issue when you are sure to be the only one writing with that pattern at that spot – Nuno Furtado Apr 10 '14 at 07:36
-
I added a suggestion over there, though I don't have the details handy. – Darius Bacon Apr 10 '14 at 19:34
JavaScript
Array.prototype.alphanumSort = function(caseInsensitive) {
for (var z = 0, t; t = this[z]; z++) {
this[z] = [], x = 0, y = -1, n = 0, i, j;
while (i = (j = t.charAt(x++)).charCodeAt(0)) {
var m = (i == 46 || (i >=48 && i <= 57));
if (m !== n) {
this[z][++y] = "";
n = m;
}
this[z][y] += j;
}
}
this.sort(function(a, b) {
for (var x = 0, aa, bb; (aa = a[x]) && (bb = b[x]); x++) {
if (caseInsensitive) {
aa = aa.toLowerCase();
bb = bb.toLowerCase();
}
if (aa !== bb) {
var c = Number(aa), d = Number(bb);
if (c == aa && d == bb) {
return c - d;
} else return (aa > bb) ? 1 : -1;
}
}
return a.length - b.length;
});
for (var z = 0; z < this.length; z++)
this[z] = this[z].join("");
}

- 57,995
- 32
- 132
- 151
-
The third line of this code differs from the source it was taken from. The third line should be replaced by: this[z] = []; var x = 0, y = -1, n = 0, i, j; – Bryce Thomas May 18 '11 at 06:42
For MySQL, I personally use code from a Drupal module, which is available at hhttp://drupalcode.org/project/natsort.git/blob/refs/heads/5.x-1.x:/natsort.install.mysql
Basically, you execute the posted SQL script to create functions, and then use ORDER BY natsort_canon(field_name, 'natural')
Here's a readme about the function: http://drupalcode.org/project/natsort.git/blob/refs/heads/5.x-1.x:/README.txt
If the OP is asking about idomatic sorting expressions, then not all languages have a natural expression built in. For c I'd go to <stdlib.h>
and use qsort
. Something on the lines of :
/* non-functional mess deleted */
to sort the arguments into lexical order. Unfortunately this idiom is rather hard to parse for those not used the ways of c.
Suitably chastened by the downvote, I actually read the linked article. Mea culpa.
In anycase the original code did not work, except in the single case I tested. Damn. Plain vanilla c does not have this function, nor is it in any of the usual libraries.
The code below sorts the command line arguments in the natural way as linked. Caveat emptor as it is only lightly tested.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
int naturalstrcmp(const char **s1, const char **s2);
int main(int argc, char **argv){
/* Sort the command line arguments in place */
qsort(&argv[1],argc-1,sizeof(char*),
(int(*)(const void *, const void *))naturalstrcmp);
while(--argc){
printf("%s\n",(++argv)[0]);
};
}
int naturalstrcmp(const char **s1p, const char **s2p){
if ((NULL == s1p) || (NULL == *s1p)) {
if ((NULL == s2p) || (NULL == *s2p)) return 0;
return 1;
};
if ((NULL == s2p) || (NULL == *s2p)) return -1;
const char *s1=*s1p;
const char *s2=*s2p;
do {
if (isdigit(s1[0]) && isdigit(s2[0])){
/* Compare numbers as numbers */
int c1 = strspn(s1,"0123456789"); /* Could be more efficient here... */
int c2 = strspn(s2,"0123456789");
if (c1 > c2) {
return 1;
} else if (c1 < c2) {
return -1;
};
/* the digit strings have equal length, so compare digit by digit */
while (c1--) {
if (s1[0] > s2[0]){
return 1;
} else if (s1[0] < s2[0]){
return -1;
};
s1++;
s2++;
};
} else if (s1[0] > s2[0]){
return 1;
} else if (s1[0] < s2[0]){
return -1;
};
s1++;
s2++;
} while ( (s1!='\0') || (s2!='\0') );
return 0;
}
This approach is pretty brute force, but it is simple and can probably be duplicated in any imperative language.

- 98,632
- 24
- 142
- 234
In C, this solution correctly handles numbers with leading zeroes:
#include <stdlib.h>
#include <ctype.h>
/* like strcmp but compare sequences of digits numerically */
int strcmpbynum(const char *s1, const char *s2) {
for (;;) {
if (*s2 == '\0')
return *s1 != '\0';
else if (*s1 == '\0')
return 1;
else if (!(isdigit(*s1) && isdigit(*s2))) {
if (*s1 != *s2)
return (int)*s1 - (int)*s2;
else
(++s1, ++s2);
} else {
char *lim1, *lim2;
unsigned long n1 = strtoul(s1, &lim1, 10);
unsigned long n2 = strtoul(s2, &lim2, 10);
if (n1 > n2)
return 1;
else if (n1 < n2)
return -1;
s1 = lim1;
s2 = lim2;
}
}
}
If you want to use it with qsort
, use this auxiliary function:
static int compare(const void *p1, const void *p2) {
const char * const *ps1 = p1;
const char * const *ps2 = p2;
return strcmpbynum(*ps1, *ps2);
}
And you can do something on the order of
char *lines = ...;
qsort(lines, next, sizeof(lines[0]), compare);

- 198,648
- 61
- 360
- 533
I just use StrCmpLogicalW. It does exactly what Jeff is wanting, since it's the same API that explorer uses. Admittedly, it's not portable.
In C++:
bool NaturalLess(const wstring &lhs, const wstring &rhs)
{
return StrCmpLogicalW(lhs.c_str(), rhs.c_str()) < 0;
}
vector<wstring> strings;
// ... load the strings
sort(strings.begin(), strings.end(), &NaturalLess);

- 44,851
- 20
- 112
- 171
Just a link to some nice work in Common Lisp by Eric Normand:
http://www.lispcast.com/wordpress/2007/12/human-order-sorting/

- 50,694
- 11
- 78
- 122
In C++ I use this example code to do natural sorting. The code requires the boost library.

- 2,321
- 3
- 21
- 24
Python, using itertools:
def natural_key(s):
return tuple(
int(''.join(chars)) if isdigit else ''.join(chars)
for isdigit, chars in itertools.groupby(s, str.isdigit)
)
Result:
>>> natural_key('abc-123foo456.xyz')
('abc-', 123, 'foo', 456, '.xyz')
Sorting:
>>> sorted(['1.1.1', '1.10.4', '1.5.0', '42.1.0', '9', 'banana'], key=natural_key)
['1.1.1', '1.5.0', '1.10.4', '9', '42.1.0', 'banana']

- 4,901
- 1
- 25
- 18
My implementation on Clojure 1.1:
(ns alphanumeric-sort
(:import [java.util.regex Pattern]))
(defn comp-alpha-numerical
"Compare two strings alphanumerically."
[a b]
(let [regex (Pattern/compile "[\\d]+|[a-zA-Z]+")
sa (re-seq regex a)
sb (re-seq regex b)]
(loop [seqa sa seqb sb]
(let [counta (count seqa)
countb (count seqb)]
(if-not (not-any? zero? [counta countb]) (- counta countb)
(let [c (first seqa)
d (first seqb)
c1 (read-string c)
d1 (read-string d)]
(if (every? integer? [c1 d1])
(def result (compare c1 d1)) (def result (compare c d)))
(if-not (= 0 result) result (recur (rest seqa) (rest seqb)))))))))
(sort comp-alpha-numerical ["a1" "a003" "b2" "a10" "a2" "1" "10" "20" "2" "c100"])
Result:
("1" "2" "10" "20" "a1" "a2" "a003" "a10" "b2" "c100")

- 314
- 3
- 10
Note that for most such questions, you can just consult the Rosetta Code Wiki. I adapted my answer from the entry for sorting integers.
In a system's programming language doing something like this is generally going to be uglier than with a specialzed string-handling language. Fortunately for Ada, the most recent version has a library routine for just this kind of task.
For Ada 2005 I believe you could do something along the following lines (warning, not compiled!):
type String_Array is array(Natural range <>) of Ada.Strings.Unbounded.Unbounded_String;
function "<" (L, R : Ada.Strings.Unbounded.Unbounded_String) return boolean is
begin
--// Natural ordering predicate here. Sorry to cheat in this part, but
--// I don't exactly grok the requirement for "natural" ordering. Fill in
--// your proper code here.
end "<";
procedure Sort is new Ada.Containers.Generic_Array_Sort
(Index_Type => Natural;
Element_Type => Ada.Strings.Unbounded.Unbounded_String,
Array_Type => String_Array
);
Example use:
using Ada.Strings.Unbounded;
Example : String_Array := (To_Unbounded_String ("Joe"),
To_Unbounded_String ("Jim"),
To_Unbounded_String ("Jane"),
To_Unbounded_String ("Fred"),
To_Unbounded_String ("Bertha"),
To_Unbounded_String ("Joesphus"),
To_Unbounded_String ("Jonesey"));
begin
Sort (Example);
...
end;

- 44,016
- 10
- 73
- 134
php has a easy function "natsort" to do that,and I implements it by myself:
<?php
$temp_files = array('+====','-==',"temp15-txt","temp10.txt",
"temp1.txt","tempe22.txt","temp2.txt");
$my_arr = $temp_files;
natsort($temp_files);
echo "Natural order: ";
print_r($temp_files);
echo "My Natural order: ";
usort($my_arr,'my_nat_func');
print_r($my_arr);
function is_alpha($a){
return $a>='0'&&$a<='9' ;
}
function my_nat_func($a,$b){
if(preg_match('/[0-9]/',$a)){
if(preg_match('/[0-9]/',$b)){
$i=0;
while(!is_alpha($a[$i])) ++$i;
$m = intval(substr($a,$i));
$i=0;
while(!is_alpha($b[$i])) ++$i;
$n = intval(substr($b,$i));
return $m>$n?1:($m==$n?0:-1);
}
return 1;
}else{
if(preg_match('/[0-9]/',$b)){
return -1;
}
return $a>$b?1:($a==$b?0:-1);
}
}

- 120
- 5
Java solution:-
This can be achieved by implementing new Comparator<String>
and pass it to Collections.sort(list, comparator)
method.
@Override
public int compare(String s1, String s2) {
int len1 = s1.length();
int len2 = s2.length();
int lim = Math.min(len1, len2);
char v1[] = s1.toCharArray();
char v2[] = s2.toCharArray();
int k = 0;
while (k < lim) {
char c1 = v1[k];
char c2 = v2[k];
if (c1 != c2) {
if(this.isInteger(c1) && this.isInteger(c2)) {
int i1 = grabContinousInteger(v1, k);
int i2 = grabContinousInteger(v2, k);
return i1 - i2;
}
return c1 - c2;
}
k++;
}
return len1 - len2;
}
private boolean isInteger(char c) {
return c >= 48 && c <= 57; // ascii value 0-9
}
private int grabContinousInteger(char[] arr, int k) {
int i = k;
while(i < arr.length && this.isInteger(arr[i])) {
i++;
}
return Integer.parseInt(new String(arr, k, i - k));
}

- 433
- 1
- 6
- 11
-
You show use of *magic numbers*. (Java `char` literals are Java integral values/`char`/`Character` is an integral type:) let `isInteger(char c)` just `return '0' <= c && c <= '9'`. – greybeard Mar 29 '22 at 21:48
-
For Tcl, the -dict (dictionary) option to lsort:
% lsort -dict {a b 1 c 2 d 13}
1 2 13 a b c d

- 16,034
- 7
- 51
- 41