64

This is the well-known problem of "ASCIIbetical" versus "natural" sort order, as it applies to PowerShell. To sort in PowerShell the same way Explorer does, you can use this wrapper over the StrCmpLogicalW API, which is what actually performs natural sorting for Windows Explorer. This requires some plumbing, though.

However, this article suggests that natural sorting can be implemented in three lines of Python. One would hope that the Get-ChildItem cmdlet, or at least the FileSystem provider, would have a built-in natural sorting option. Unfortunately, they do not.

So here is the question: what is the simplest implementation of this in PowerShell? By simple I mean the least amount of code to write, and preferably no third-party or external scripts or components. Ideally I want a short PowerShell function that does the sorting for me.

Andrew Savinykh
  • Thanks to the interesting question, I have just tried to use a regular expression match evaluator in PowerShell. And it actually works! I should think of other applications, then. – Roman Kuzmin Mar 25 '11 at 06:23
  • FYI, the three-liner won't work like Windows Explorer. `1` will be sorted before `+1`, which is not the case in Explorer. – user136036 Jan 29 '20 at 12:21

4 Answers

113

TL;DR

Get-ChildItem | Sort-Object { [regex]::Replace($_.Name, '\d+', { $args[0].Value.PadLeft(20) }) }

Here is some very short code (just the $ToNatural script block) that does the trick: a regular expression with a match evaluator pads every run of digits with spaces to a fixed width of 20 characters. Sorting the padded strings as usual then yields natural order, as long as no number is longer than 20 digits.

$ToNatural = { [regex]::Replace($_, '\d+', { $args[0].Value.PadLeft(20) }) }

'----- test 1 ASCIIbetical order'
Get-Content list.txt | Sort-Object

'----- test 2 input with padded numbers'
Get-Content list.txt | %{ . $ToNatural }

'----- test 3 Natural order: sorted with padded numbers'
Get-Content list.txt | Sort-Object $ToNatural

Output:

----- test 1 ASCIIbetical order
1.txt
10.txt
3.txt
a10b1.txt
a1b1.txt
a2b1.txt
a2b11.txt
a2b2.txt
b1.txt
b10.txt
b2.txt
----- test 2 input with padded numbers
                   1.txt
                  10.txt
                   3.txt
a                  10b                   1.txt
a                   1b                   1.txt
a                   2b                   1.txt
a                   2b                  11.txt
a                   2b                   2.txt
b                   1.txt
b                  10.txt
b                   2.txt
----- test 3 Natural order: sorted with padded numbers
1.txt
3.txt
10.txt
a1b1.txt
a2b1.txt
a2b2.txt
a2b11.txt
a10b1.txt
b1.txt
b2.txt
b10.txt

And finally we use this one-liner to sort files by names in natural order:

Get-ChildItem | Sort-Object { [regex]::Replace($_.Name, '\d+', { $args[0].Value.PadLeft(20) }) }

Output:

    Directory: C:\TEMP\_110325_063356

Mode                LastWriteTime     Length Name                                                                                                                  
----                -------------     ------ ----                                                                                                                  
-a---        2011-03-25     06:34          8 1.txt                                                                                                                 
-a---        2011-03-25     06:34          8 3.txt                                                                                                                 
-a---        2011-03-25     06:34          8 10.txt                                                                                                                
-a---        2011-03-25     06:34          8 a1b1.txt                                                                                                              
-a---        2011-03-25     06:34          8 a2b1.txt                                                                                                              
-a---        2011-03-25     06:34          8 a2b2.txt                                                                                                              
-a---        2011-03-25     06:34          8 a2b11.txt                                                                                                             
-a---        2011-03-25     06:34          8 a10b1.txt                                                                                                             
-a---        2011-03-25     06:34          8 b1.txt                                                                                                                
-a---        2011-03-25     06:34          8 b2.txt                                                                                                                
-a---        2011-03-25     06:34          8 b10.txt                                                                                                               
-a---        2011-03-25     04:54         99 list.txt                                                                                                              
-a---        2011-03-25     06:05        346 sort-natural.ps1                                                                                                      
-a---        2011-03-25     06:35         96 test.ps1                                                                                                              
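
Since the question asks for a short function, the same padding trick could be wrapped roughly as follows. This is only a sketch, not part of the original answer: the name Sort-Natural and its -Property and -Descending parameters are made up for illustration, and it inherits the limits of the one-liner (numbers longer than 20 digits, and differences from Explorer's ordering).

```powershell
# Hypothetical helper (name and parameters are assumptions, not from the answer).
# It collects pipeline input and sorts it by a key in which every run of digits
# is left-padded with spaces to 20 characters.
function Sort-Natural {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true)]
        $InputObject,

        # Optional property to sort on (for example Name).
        # When omitted, the string form of each object is used.
        [string]$Property,

        [switch]$Descending
    )
    begin   { $items = New-Object System.Collections.ArrayList }
    process { [void]$items.Add($InputObject) }
    end {
        $key = {
            $text = if ($Property) { [string]$_.$Property } else { [string]$_ }
            [regex]::Replace($text, '\d+', { $args[0].Value.PadLeft(20) })
        }
        $items | Sort-Object $key -Descending:$Descending
    }
}
```

With that in place, something like `Get-ChildItem | Sort-Natural -Property Name` or `Get-Content list.txt | Sort-Natural -Descending` should behave like the one-liner above.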
Roman Kuzmin
  • 6
    You can save that $ToNatural scriptblock in your profile, then it will be available to you whenever you want it. – JasonMArcher Mar 26 '11 at 17:26
  • 1
    I tried this and found it does not work when you have something like: 1.1 Testfile.txt 1.2 Testfile.txt 1.10 Testfile.txt While it does display in this order in an explorer window. – Jim Feb 18 '20 at 17:43
  • It works for me, I have just tried. Are you expecting sort by "floating point numbers"? Then a different script is needed. – Roman Kuzmin Feb 19 '20 at 04:44
  • 1
    That one-liner is gold. You should put it at the top of your answer as a TL;DR. – Damian Powell May 12 '22 at 11:48
  • It isn't perfect. Consider: `0fe68296`, `01b7f045` and `1a2a5773`. In Explorer, when sorting by name, they are in the order above. Yet, the code in the answer makes it sort into: `0fe68296`, `1a2a5773`, `01b7f045`. – AgainPsychoX Aug 23 '23 at 06:29
  • I made gist to show that issue: https://gist.github.com/AgainPsychoX/4e6aabc8e8832a2afde1386d038eda31 – AgainPsychoX Aug 23 '23 at 06:37
9

Allow me to copy and paste my answer from another question.

Powershell Sort-Object Name with numbers doesn't properly

Windows Explorer uses a legacy API from shlwapi.dll called StrCmpLogicalW, which is why you see different sorting results.

I didn't want to pad with zeros, so I wrote a script.

https://github.com/LarrysGIT/Powershell-Natural-sort

Since I am not a C# expert, pull requests are appreciated if the code is not tidy.

The following PowerShell script uses the same API.

function Sort-Naturally
{
    PARAM(
        [System.Collections.ArrayList]$Array,
        [switch]$Descending
    )

    Add-Type -TypeDefinition @'
using System;
using System.Collections;
using System.Collections.Generic;
using System.Runtime.InteropServices;
namespace NaturalSort {
    public static class NaturalSort
    {
        [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
        public static extern int StrCmpLogicalW(string psz1, string psz2);
        public static System.Collections.ArrayList Sort(System.Collections.ArrayList foo)
        {
            foo.Sort(new NaturalStringComparer());
            return foo;
        }
    }
    public class NaturalStringComparer : IComparer
    {
        public int Compare(object x, object y)
        {
            return NaturalSort.StrCmpLogicalW(x.ToString(), y.ToString());
        }
    }
}
'@
    $Array.Sort((New-Object NaturalSort.NaturalStringComparer))
    if($Descending)
    {
        $Array.Reverse()
    }
    return $Array
}

The test results are below.

PS> # Natural sort
PS> . .\NaturalSort.ps1
PS> Sort-Naturally -Array @('2', '1', '11')
1
2
11
PS> # If regular sort is being used
PS> @('2', '1', '11') | Sort-Object
1
11
2

PS> # Not good
PS> $t = ls .\testfiles\*.txt
PS> $t | Sort-Object
1.txt
10.txt
2.txt

PS> # Good
PS> Sort-Naturally -Array $t
1.txt
2.txt
10.txt
Larry Song
  • 2
    Can you give advantages/disadvantages of your answer as compared to the accepted answer? – Andrew Savinykh Jan 19 '18 at 04:26
  • 3
    Not the OP but... Pros: + Battle Tested Implementation + Guaranteed to behave exactly the way windows explorer does. + Doesn't use padding so should handle arbitrarily large numbers. (20+) Cons: - Depends on windows DLL so Linux versions of Powershell probably can't use this. - Always possible they could remove DLL one day from Windows. Neutral: * Couldn't find any appreciable speed difference between this and regex version. – RiverHeart Jun 03 '18 at 14:56
  • 1
    Actually, looking at it again, the C# solution does seem faster on multiple iterations. – RiverHeart Jun 03 '18 at 15:41
  • 1
    This was the only answer in here that actually sorted the same way File Explorer does. Thank you. – WillB3 Jan 30 '19 at 13:15
  • @WillB3 a few people state this but with no support / details. Can you give an example that sorts correctly by this but not by the accepted answer? – Andrew Savinykh Jul 20 '22 at 01:29
  • @Larry Song Unfortunately, I was not able to get this to run. Sort-Naturally : The term 'Sort-Naturally' is not recognized as the name of a cmdlet... Any thoughts on why and how do I use the descending option? – dashrader Jul 27 '23 at 16:24
7

I prefer @Larry Song's answer because it sorts exactly the way Windows Explorer does. I tried to simplify it a little to make it less intrusive.

Add-Type -TypeDefinition @"
using System.Runtime.InteropServices;
public static class NaturalSort
{
    [DllImport("Shlwapi.dll", CharSet = CharSet.Unicode)]
    private static extern int StrCmpLogicalW(string psz1, string psz2);
    public static string[] Sort(string[] array)
    {
        System.Array.Sort(array, (psz1, psz2) => StrCmpLogicalW(psz1, psz2));
        return array;
    }
}
"@

Then you can use it like:

$array = ('1.jpg', '10.jpg', '2.jpg')
[NaturalSort]::Sort($array)

which outputs:

1.jpg
2.jpg
10.jpg
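
If you need to sort objects (such as the output of Get-ChildItem) rather than plain strings, or want the reverse order, one possible variation is sketched below. It is Windows-only because it calls shlwapi.dll, and the names ExplorerOrder, SortByKeys and Sort-LikeExplorer are invented for this sketch so they do not clash with the NaturalSort type above. It assumes every input object has a Name property.

```powershell
# Windows only: StrCmpLogicalW is exported by shlwapi.dll.
# ExplorerOrder, SortByKeys and Sort-LikeExplorer are hypothetical names.
Add-Type -TypeDefinition @'
using System.Collections.Generic;
using System.Runtime.InteropServices;
public static class ExplorerOrder
{
    [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
    private static extern int StrCmpLogicalW(string psz1, string psz2);

    private sealed class Comparer : IComparer<string>
    {
        public int Compare(string x, string y) { return StrCmpLogicalW(x, y); }
    }

    // Sorts keys in place and reorders items so they stay aligned with the keys.
    public static void SortByKeys(string[] keys, object[] items)
    {
        System.Array.Sort(keys, items, new Comparer());
    }
}
'@

function Sort-LikeExplorer {
    param(
        [object[]]$Items,
        [switch]$Descending
    )
    # Use each object's Name as the sort key (assumption: a Name property exists).
    $keys = [string[]]@($Items | ForEach-Object { [string]$_.Name })
    [ExplorerOrder]::SortByKeys($keys, $Items)
    if ($Descending) { [array]::Reverse($Items) }
    $Items
}
```

Usage would look like `Sort-LikeExplorer -Items @(Get-ChildItem) -Descending`. Sorting keys and items together in C# keeps the P/Invoke comparison out of the PowerShell pipeline, which is a design choice of this sketch rather than something the answers above prescribe.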
Elderry
  • Several people voiced concern that the accepted answer does NOT sort the same way Explorer does. Do you happen to have an example? – Andrew Savinykh Jan 01 '21 at 06:49
  • @AndrewSavinykh, yes I do. You can try `1.txt` vs `1_.txt`. – Elderry Jan 01 '21 at 07:53
  • @Elderry I was not able to get Larry Song's solution to work. I need to reverse the sort. Can you add that as an option and show an example of using it, please? – dashrader Jul 27 '23 at 16:23
  • You can always [reverse an array](https://devblogs.microsoft.com/scripting/powertip-use-powershell-to-reverse-array/#:~:text=How%20can%20I%20use%20Windows%20PowerShell%20to%20reverse,%3D%201%2C2%2C3%2C4%2C5%20%24b%20%3D%20%24a%20%7C%20sort%20-Descending) after you get the result. @dashboard – Elderry Aug 02 '23 at 06:31
3

Translation from python to PowerShell works pretty well:

function sort-dir {
    param($dir)
    $toarray = {
        @($_.BaseName -split '(\d+)' | ?{$_} |
        % { if ([int]::TryParse($_,[ref]$null)) { [int]$_ } else { $_ } })
    }
    gci $dir | sort -Property $toarray
}

#try it
mkdir $env:TEMP\mytestsodir
1..10 + 100..105 | % { '' | Set-Content $env:TEMP\mytestsodir\$_.txt }
sort-dir $env:TEMP\mytestsodir
Remove-Item $env:TEMP\mytestsodir -Recurse

You can do even better with the proxy function approach: add a -natur parameter to Sort-Object and you have a pretty elegant solution.

Update: At first I was quite surprised that PowerShell handles comparing arrays this way. But after I created additional test files with ("a0", "a100", "a2") + 1..10 + 100..105 | % { '' | Set-Content $env:TEMP\mytestsodir\$_.txt }, it turned out that it doesn't work. So I think there is no solution as elegant as the Python one, because PowerShell is static under the covers, whereas Python is dynamic.

stej