21

I'm looking for an algorithm that sorts strings similar to the way files (and folders) are sorted in Windows Explorer. It seems that numeric values in strings are taken into account when sorted which results in something like

name 1, name 2, name 10

instead of

name 1, name 10, name 2

which you get with a regular string comparison.

I was about to start writing this myself but wanted to check if anyone had done this before and was willing to share some code or insights. The way I would approach this would be to add leading zeros to the numeric values in the name before comparing them. This would result in something like

name 00001, name 00010, name 00002

which when sorted with a regular string sort would give me the correct result.

Any ideas?

Christophe Herreman
  • 15,895
  • 9
  • 58
  • 86

8 Answers8

16

It's called "natural sort order". Jeff had a pretty extensive blog entry on it a while ago, which describes the difficulties you might overlook and has links to several implementations.

Robert Harvey
  • 178,213
  • 47
  • 333
  • 501
Michael Borgwardt
  • 342,105
  • 78
  • 482
  • 720
7

Explorer uses the API StrCmpLogicalW() for this kind of sorting (called 'natural sort order').

You don't need to write your own comparison function, just use the one that already exists.

A good explanation can be found here.

Community
  • 1
  • 1
Stefan
  • 43,293
  • 10
  • 75
  • 117
3

There is StrCmpLogicalW, but it's only available starting with Windows XP and only implemented as Unicode.

Some background information: http://www.siao2.com/2006/10/01/778990.aspx

Community
  • 1
  • 1
Otherside
  • 2,805
  • 22
  • 21
1

This is a try to implement it in Java:

Java - Sort Strings like Windows Explorer

In short it splits the two Strings to compare in Letter - Digit Parts and compares this parts in a specific way to achieve this kind of sorting.

Community
  • 1
  • 1
wumpz
  • 8,257
  • 3
  • 30
  • 25
1

The way I understood it, Windows Explorer sorts as per your second example - it's always irritated me hugely that the ordering comes out 1, 10, 2. That's why most apps which write lots of files (like batch apps) always use fixed length filenames with leading 0's or whatever.

Your solution should work, but you'd need to be careful where the numbers were in the filename, and probably only use your approach if they were at the very end.

xan
  • 7,440
  • 8
  • 43
  • 65
1

Have a look at

http://www.interact-sw.co.uk/iangblog/2007/12/13/natural-sorting

for some source code.

WOPR
  • 5,313
  • 6
  • 47
  • 63
1

I also posted a related question with additional hints and pitfalls:

Sorting strings is much harder than you thought

Community
  • 1
  • 1
Carl Seleborg
  • 13,125
  • 11
  • 58
  • 70
1

I posted code (C#) and a description of the algorithm here:

Natural Sort Order in C#

Community
  • 1
  • 1
J.D.
  • 2,114
  • 17
  • 13