4

PowerShell doesn't seem to sort objects by name ascending correctly:

dir | Sort-Object Name

enter image description here

My goal is to sort the elements in this order:

1.sql
2.sql
3.sql
...
9.sql
10.sql
11.sql
...

Is there a way to solve this?

sɐunıɔןɐqɐp
  • 3,332
  • 15
  • 36
  • 40
Francesco Mantovani
  • 10,216
  • 13
  • 73
  • 113

3 Answers3

7

You need to sort the filenames as numbers, rather than as text. It's convenient because the filenames in your example are entirely numbers, so you can change the sort to use a dynamic scriptblock property, which will evaluate the filename to a number for each item in the pipeline:

| sort-object -Property {
    if (($i = $_.BaseName -as [int])) { $i } else { $_ }
}

That means: if the filename can be converted to an integer, then use that, otherwise use it as it is.

sɐunıɔןɐqɐp
  • 3,332
  • 15
  • 36
  • 40
TessellatingHeckler
  • 27,511
  • 4
  • 48
  • 87
  • I want to sort `Get-Item env:|sort -Property name`, but it sorts similar to just `get-item env:`. Is this a new question or can you answer here? – Timo Mar 07 '21 at 19:42
  • 1
    @Timo I think the difference is between `Get-Item env:` and `Get-ChildItem env:`, the first returns a single ValueCollection, the second returns an array of entries. I'm not sure on the details, but `Env:` is a PowerShell Provider pretending to be a drive like `C:`, so it's like doing `Get-Item C:` and expecting to get all the files inside. You can do it with `Get-ChildItem Env: | Sort-Object -Property Name` or something like `(Get-Item Env:).GetEnumerator() | Sort-Object -Property Name`. – TessellatingHeckler Mar 10 '21 at 23:58
  • get-childitem put me on the right track. Why is `(Get-Item c:).GetEnumerator()` not working? (IT is like one question leads to another;)) Probably because get-item c: returns one object, so there is no enumerator. Anyway, just `gci env:` works – Timo Mar 12 '21 at 19:26
  • 1
    @Timo I'm guessing a lot: `Get-Item C:` is type `System.IO.DirectoryInfo`, the same as any directory, which goes back to early .Net and C#. `Env:` is PowerShell only, it's not a filesystem directory so must be another type, and what they used (ValueCollection) has an enumerator. I guess DirectoryInfo doesn't have one because it's more complex than "contents" only - see `(get-item c:\)|gm enumerate*, get*` and there are methods to enumerate Files, Directories, both, overloads for with/without subdirectories, Access control lists, etc. Too much for one enumerator. – TessellatingHeckler Mar 12 '21 at 21:17
1

For more complex patterns enclosed in alphabetic characters use $ToNatural which expands all embedded numbers to a unique length (here 20) by left padding with zeros.

Source: Roman Kuzmin - How to sort by file name the same way Windows Explorer does?

$ToNatural = { [regex]::Replace($_, '\d+', { $args[0].Value.PadLeft(20,"0") }) }

Generate some test data:

> (10..11 + 100..101 + 1..2)|%{new-item -itemtype file -path ("pre_{0}_post.sql" -f $_)
    Directory: A:\

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----       2018-07-27     11:02              0 pre_10_post.sql
-a----       2018-07-27     11:02              0 pre_11_post.sql
-a----       2018-07-27     11:02              0 pre_100_post.sql
-a----       2018-07-27     11:02              0 pre_101_post.sql
-a----       2018-07-27     11:02              0 pre_1_post.sql
-a----       2018-07-27     11:02              0 pre_2_post.sql

> dir| sort
    Directory(: A:\

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----       2018-07-27     11:02              0 pre_1_post.sql
-a----       2018-07-27     11:02              0 pre_10_post.sql
-a----       2018-07-27     11:02              0 pre_100_post.sql
-a----       2018-07-27     11:02              0 pre_101_post.sql
-a----       2018-07-27     11:02              0 pre_11_post.sql
-a----       2018-07-27     11:02              0 pre_2_post.sql

> dir|sort $ToNatural
    Directory: A:\

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----       2018-07-27     11:02              0 pre_1_post.sql
-a----       2018-07-27     11:02              0 pre_2_post.sql
-a----       2018-07-27     11:02              0 pre_10_post.sql
-a----       2018-07-27     11:02              0 pre_11_post.sql
-a----       2018-07-27     11:02              0 pre_100_post.sql
-a----       2018-07-27     11:02              0 pre_101_post.sql
sɐunıɔןɐqɐp
  • 3,332
  • 15
  • 36
  • 40
0

Others have already provided solutions which would work, here I would like to provide an explanation about what it does and how to utilize it.

Actually it sorts by name correctly, the name of the sorting method called is "lexicographic ordering": https://en.wikipedia.org/wiki/Lexicographic_order

Every letter has a value in this ordering, and two words are compared character by character at the same position, and when the first one in one of these has a lower value, then that is picked as the string that has a "lower" value and hence appears sooner in the list than the second one.

In your list, this is the reason why all the names that start with a '1' appear before those that start with '2','3',...,etc..

You can utilize this actually in order to achieve the ordering that you want without having to write any script whatsoever.

By prepending every file with a '0' that has an "id" less than 10, you would achieve the following output:

01.sql
02.sql
03.sql
...
09.sql
10.sql
11.sql

because the value of the character '0' is less than the value of '1', the same goes for everything else, and it can be expanded to higher numbers, if you would have 100 or more files, every lesser value file you would prepend with a '00' and that would give you the correct ordering again.

I am using this for dates as well sometimes, the important thing there to understand is that only the ISO date format is lexicographically orderable, since lexicographic ordering requires that the most significant characters appear first and then gradually every character has less significance, i.e. the ISO format hence looks like this year-month-day, instead of the more usual format day.month.year, or the more weird ones month/day/year.

Hope it helps.